System and method for exchanging information over a Jacobi MIMO channel

ABSTRACT

A method and system for transmitting data over a Jacobi MIMO channel when using channel state feedback.

RELATED APPLICATIONS

This application claims the priority of U.S. provisional patent Ser. No. 61/593,346 filing date Feb. 1, 2012 which is incorporated herein by reference.

BACKGROUND

An explosive demand for data network bandwidth has emerged over the last two decades as the use of the internet and other related data services has increased. This demand grows exponentially at a rate close to 60% per year. This growth rate is not about to slow down, considering that bandwidth-intensive data services such as cloud computing and real-time multimedia applications are expected to gain importance. The growth in network traffic was made economically feasible by wavelength-division multiplexing (WDM) technologies researched and developed in the early 1990s. Initially, WDM allowed the optical transport throughput to grow at a rate of 80% per year. However, in the last decade this growth rate slowed dramatically to about 20% per year. This was explained by recent studies on the non-linear Shannon capacities of optical fibers, which showed that current optical transport systems approach their fundamental limits to within a factor of 2.

It is thus clear that new strategies need to be found to continue supporting the ever growing demand for data network bandwidth. To that end, intense research efforts have been directed in recent years at finding new strategies beyond WDM. Multiplexing techniques such as time-division multiplexing (TDM), complex modulation and, more recently, polarization-division multiplexing (PDM) were employed in the latest generation of optical systems. The only physical dimension left to be exploited is space. Recent studies showed that space-division multiplexing (SDM) is currently the only known method that allows a substantial increase in optical transport capacities and yet is economically attractive. Thus SDM technology is considered today to be the most promising strategy for next generation optical systems satisfying the network growth for the next decade and beyond.

SDM strategies in wireless communication, where multiple antennas are used at the transmitter and/or receiver, have been extensively researched in the last two decades. The most common statistical model of the wireless channel is the Rayleigh fading—the path gain between each transmit and receive antennas is assumed to have a Normal distribution whereas all path gains are independent. Many important works have been conducted in this field where the capacity, error and outage probabilities were comprehensively analyzed. A further fundamental tradeoff in spatially multiplexed wireless systems was defined and analyzed—the tradeoff between multiplexing, exploiting the multiple antennas for higher transmission rate, and diversity, achieving better error probability by transmitting the same signal through multiple paths.

In optical communication an SDM system uses m parallel transmission paths per wavelength which optimally multiplies the potential throughput of a certain link by a factor of m.

Since m can potentially be chosen very large, SDM technology is highly scalable. These parallel optical paths could be multiple single-mode fiber strands within a fiber cable, multiple cores within a multi-core fiber, or multiple modes within a multi-mode waveguide. In this work we consider the multi-mode fiber, however results are applicable also in all other SDM optical structures.

Now, significant crosstalk between the independent optical paths raises the need for multiple-input multiple-output (MIMO) techniques. However, signal processing for large MIMO schemes (large m) is currently not feasible at optical rates. Assuming that signal processing at higher rates will be available in the future, and bearing in mind that replacing optical fibers to support SDM is a long and expensive procedure, one will want to make a long-term design. To that end, it was proposed to design an optical system that can support a relatively large number of paths for future use, but initially to address only some of the paths. Winzer and Foschini discuss this Jacobi channel and present simulations of the capacities and outage probabilities. The importance of addressing all paths was shown: zero outage probability can be attained for any transmission rate only when all paths are addressed both at the transmitter and receiver. The outage probability is an important measure in optical systems and is required to be very low. Thus, choosing the number of addressed paths is a critical design step that strongly affects the system outage.

SUMMARY

According to an embodiment of the invention there are provided systems, receivers, transmitters and methods for transmitting data over a Jacobi MIMO channel.

BRIEF DESCRIPTION OF THE DRAWINGS

Further features and advantages of the invention will be apparent from the description below. The invention is herein described, by way of example only, with reference to the accompanying drawings, wherein:

FIG. 1 illustrates normalized ergodic capacities;

FIG. 2 illustrates channel characters for different outage probabilities;

FIG. 3 illustrates outage probabilities versus signal to noise ratios;

FIG. 4 illustrates average error probabilities versus signal to noise ratios;

FIG. 5 illustrates various DMT curves;

FIG. 6 illustrates diversity versus multiplexing gains for various reception schemes;

FIG. 7 illustrates ergodic capacities versus signal to noise ratios for various channel models;

FIG. 8A depicts outage probability in a 20 dB SNR for 4×4 unitary channel, for different number of addressed modes at the receiver and transmitter;

FIG. 8B illustrates outage curves for different number of supported modes m;

FIG. 9 illustrates a system according to an embodiment of the invention; and

FIG. 10 illustrates a method according to an embodiment of the invention.

DETAILED DESCRIPTION OF THE DRAWINGS

Recent theoretical results indicate that in optical multimode/multicore communication it is possible to transmit a number of streams over the channel with no outage. There are provided a system and a method that use channel state feedback to attain this theoretical performance. The feedback can be delayed and can have a low rate. The proposed scheme may be an enabler for optical multimode/multicore communication. The invention may also be applied in additional multi-input multi-output communication scenarios, e.g. in-line communication, wireless communication, and so on.

There is provided a system (that includes a transmitter), a method and a non-transitory computer readable medium that may utilize a communication scheme that may be applied in optical fiber communication. There is a novel trend of using multimode and multicore fibers, where several communication links are established in a single fiber. There is random “cross talk” and interference between the channels. Unlike the situation in wireless communication, the system, method and computer readable medium in some cases can communicate a number of streams with no outage no matter how the channel leakage behaves.

A transmitter may be provided and may send K information streams from M modes/cores where K is determined by the theory and is smaller than N the number of addressable modes/cores at the transmitter.

The transmitter uses feedback on the channel state to retransmit a linear combination of previous signals from the N-K residual modes/cores. The retransmission is performed in a way that enables the receiver to recover from outage and to apply simple decoding schemes.

A communication scheme is provided and may use feedback on channel state to correct previously sent symbols in multi-input multi-output channel.

The communication scheme may be able to obtain no-outage communication in random channels. No outage is crucial for optical communication: due to the high rates, the required error rates must be very low, and so outage cannot be tolerated. The communication scheme should work even when the channel has changed by the time the feedback is received, that is, when the feedback is “outdated”.

The communication scheme with some adjustments can be applied to wireless communication. It may use data feedback, not only channel feedback.

FIG. 9 illustrates system 900 according to an embodiment of the invention.

System 900 includes receiver 910 and transmitter 920. The receiver 910 and the transmitter 920 are coupled to each other by a communication channel 930. The communication channel 930 can belong to the system 900 but it may not belong to the system 900.

The transmitter 920 may be arranged to transmit using a first number (Mt) of transmission paths, multiple transmitter data streams and a transmitter feedback stream; wherein the transmitter feedback stream comprises transmitter feedback symbols calculated in response to receiver feedback stream symbols. The transmitter data streams include new information while the transmitter feedback stream is responsive to previously transmitted information. The transmitter feedback stream is calculated according to the channel state feedback from the receiver.

The receiver 910 may be arranged to:

a. receive, during multiple points in time and using a second number (Mr) of reception paths, received signals that represent the multiple transmitter data streams and the transmitter feedback stream;

b. estimate transfer matrixes of the communication channel that correspond to the multiple points in time; wherein the transfer matrixes are portions of unitary matrixes, each unitary matrix is associated with a point in time of the multiple points in time; and

c. transmit over a feedback channel, a receiver feedback stream indicative of the transfer matrix.

The feedback channel can differ from the communication channel. Alternatively, the feedback channel can be a part of the communication channel.

The transmitter 920 may be further arranged to perform at least one additional transmission of a transmitter data symbol of the transmitter data streams to guarantee that the receiver is capable of reconstructing the transmitter data symbol with a desired certainty to provide a reconstructed transmitter data symbol. This desired certainty may be responsive to the statistical characteristics of the communication channel noise, as an ideal reconstruction of the transmitter data symbol assumes a noise free communication channel.

The receiver 910 may be further arranged to reconstruct the transmitter data streams in response to the reconstructed transmitter data symbol and the plurality of received streams.

Each one of the first number (Mt) and the second number (Mr) is smaller than a maximal number of paths (M) supported by the communication channel.

In order to guarantee the existence of at least some (K) outage free communication paths K should not exceed Mt+Mr−M.
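As a small illustration of this bound, the following Python sketch computes the largest number of outage free streams for given path counts. The function name `max_outage_free_streams` is hypothetical and is not part of the described system; the code only restates the inequality K ≤ Mt+Mr−M.

```python
def max_outage_free_streams(mt: int, mr: int, m: int) -> int:
    """Largest K allowed by K <= Mt + Mr - M (0 if the bound is not positive)."""
    if not (1 <= mt <= m and 1 <= mr <= m):
        raise ValueError("expected 1 <= Mt <= M and 1 <= Mr <= M")
    return max(0, mt + mr - m)


if __name__ == "__main__":
    # Example used later in the description: Mt = 3, Mr = 3, M = 4.
    print(max_outage_free_streams(3, 3, 4))  # prints 2
```

With Mt=Mr=2 and M=4 the bound is zero, so under this reading no stream is guaranteed to be outage free.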

FIG. 9 also illustrates the receiver 910 as including a receiving module 911, transfer matrix estimator 912, receiver feedback module 913 and reconstruction module 914.

The receiving module 911 is arranged to receive, during multiple points in time and using a second number (Mr) of reception paths, a plurality of receiver data streams that represent multiple transmitter data streams and a transmitter feedback stream.

The receiving module 911 is also arranged to receive at least one additional transmission of a transmitter data symbol of the transmitter data streams. The multiple transmitter data streams are transmitted by a transmitter coupled to the receiver via a communication channel, using a first number (Mt) of transmission paths. The transmitter feedback stream comprises transmitter feedback symbols calculated in response to receiver feedback stream symbols.

The transfer matrix estimator 912 is arranged to estimate transfer matrixes of the communication channel that correspond to the multiple points in time.

The receiver feedback module 913 is arranged to transmit over a feedback channel, a receiver feedback stream indicative of the transfer matrix; wherein the receiver feedback stream comprises the receiver feedback stream symbols.

Each transfer matrix may belong to a unitary matrix. The receiver feedback module 913 may transmit another part of the unitary matrix (that includes the transfer matrix). The other part of the unitary matrix can be used to reconstruct the transfer matrix and may be smaller than the transfer matrix.

The reconstruction module 914 is arranged to reconstruct, in response to the reception of the at least one additional transmission of the transmitter data symbol, the transmitter data symbol to provide a reconstructed transmitter data symbol; and reconstruct the transmitter data streams in response to the reconstructed transmitter data symbol and the plurality of receiver streams.

FIG. 9 also illustrates the transmitter 920 as including transmission module 921 and transmitter feedback symbol calculator 922.

The transmission module 921 may be arranged to transmit, using a first number (Mt) of transmission paths, multiple transmitter data streams and a transmitter feedback stream.

The transmitter feedback symbol calculator 922 may be arranged to calculate transmitter feedback stream symbols of the transmitter feedback stream in response to receiver feedback stream symbols. The receiver feedback stream is sent to the transmitter over a feedback channel by a receiver using a second number (Mr) of reception paths. The receiver and the transmitter are coupled to a communication channel.

The receiver feedback stream is indicative of the transfer matrixes. The receiver feedback stream may include the transfer matrixes or any information that will allow the transmitter to calculate the transfer matrixes. For example, each transfer matrix can belong to a unitary matrix and the receiver feedback stream may include parts of the unitary matrixes that include the transfer matrixes.

The transmission module 921 is further arranged to perform at least one additional transmission of a transmitter data symbol of the transmitter data streams to guarantee that the receiver is capable of reconstructing the transmitter data symbol with a desired certainty to provide a reconstructed transmitter data symbol.

According to various embodiments of the invention the transmitter may be arranged to transmit the multiple transmitter data streams and the transmitter feedback stream concurrently.

The number of the transmitter data streams may not exceed Mt+Mr−M.

The communication channel 930 may be a multimode optic fiber and the transmission paths are implemented by multi-modes of the transmitter.

The communication channel may be an optical fiber that has multiple cores and the transmission paths are implemented by the multiple cores.

The receiver feedback stream may include the transfer matrixes of the communication channel during the receiving.

The receiver feedback stream may include parts of the unitary matrices that differ from the transfer matrixes of the communication channel during the receiving but facilitate a reconstruction of the transfer matrixes. These parts of the unitary matrixes may be smaller than the transfer matrixes.

The transmitter may be arranged to transmit, at a certain time slot, multiple transmitter data symbols of the multiple transmitter data streams and a transmitter feedback symbol of the transmitter feedback stream, the transmitter feedback symbol is responsive to a transmitter data symbol of the multiple transmitter data streams and to a receiver feedback symbol received during a time slot that precedes the certain time slot.

The transmitter may be arranged to transmit, at a certain time slot, multiple transmitter data symbols of the multiple transmitter data streams and a transmitter feedback symbol of the transmitter feedback stream, the transmitter feedback symbol is responsive to a product of (a) transmitter data symbol of the multiple transmitter data streams and (b) a receiver feedback symbol received during a time slot that precedes the certain time slot.

FIG. 10 illustrates method 1000 according to an embodiment of the invention.

Method 1000 is used for transmitting information over a communication channel coupled between a transmitter and a receiver. Method 1000 starts by initialization stage 1010. This stage may include transmitting data symbols without feedback.

Initialization stage 1010 may be followed by stages 1020, 1030, 1040 and 1050.

Stage 1020 includes transmitting, by the transmitter using a first number (Mt) of transmission paths, multiple transmitter data streams and a transmitter feedback stream; wherein the transmitter feedback stream comprises transmitter feedback symbols calculated in response to receiver feedback stream symbols.

Stage 1030 includes receiving, by the receiver during multiple points in time and using a second number (Mr) of reception paths, a plurality of received streams that represent the multiple transmitter data streams and the transmitter feedback stream. Each one of the first number (Mt) and the second number (Mr) is smaller than a maximal number of paths (M) supported by the communication channel.

Stage 1040 includes estimating by the receiver, transfer matrixes of the communication channel that correspond to the multiple points in time; wherein the transfer matrixes are portions of unitary matrixes, each unitary matrix is associated with a point in time of the multiple points in time.

Stage 1050 includes transmitting by the receiver and over a feedback channel, a receiver feedback stream indicative of parts of the unitary matrixes; wherein the receiver feedback stream comprises the receiver feedback stream symbols.

Stages 1020, 1030, 1040 and 1050 may be followed by stage 1060 of performing at least one additional transmission of a transmitter data symbol of the transmitter data streams to guarantee that the receiver is capable of reconstructing the transmitter data symbol with a desired certainty to provide a reconstructed transmitter data symbol.

Stage 1060 may be followed by stage 1070 of reconstructing, by the receiver, the transmitter data streams in response to the reconstructed transmitter data symbol and the plurality of received streams.

Stages 1020, 1030, 1040 and 1050 may include multiple iterative steps. A transmitter feedback symbol is calculated based upon previously received feedback symbol that in turn reflects a transfer matrix of the communication channel at yet a previous point in time.

The following table provides an example of few iterations of stages 1020-1050.

It is assumed, for simplicity of explanation that:

a. Mt=3, Mr=3, M=4.

b. At any given time slot (represented by index i) the communication channel can be represented by a 3×3 transfer matrix $H_{11,i}$ that is a portion of a 4×4 unitary matrix $U_i$:

$U_i = \begin{pmatrix} H_{11,i} & U_{12,i} \\ U_{21,i} & U_{22,i} \end{pmatrix}$

c. The delay (k) between a point in time of a transfer matrix and the reception of the appropriate receiver feedback symbols is 3 time slots.

d. The transmission of the transmitter data streams ends after p time slots.

e. There is a need to perform additional transmissions of the transmitter data symbol during q time slots, wherein q>3.

TABLE 1

| Time slot (index i) | Transmitter | Receiver |
| --- | --- | --- |
| 1 | Transmits two transmitter data symbols and a default value (for example 0) for the transmitter feedback symbol - as there are no receiver feedback stream symbols yet. | Receives three received symbols, estimates and sends through the feedback channel information indicative of a first transfer matrix (as existed at the first time slot). |
| 2 | Transmits two transmitter data symbols and a default value (for example 0) for the transmitter feedback symbol - as there are no receiver feedback stream symbols yet. | Receives three received symbols, estimates and sends through the feedback channel information indicative of a second transfer matrix (as existed at the second time slot). |
| 3 | Transmits two transmitter data symbols and a default value (for example 0) for the transmitter feedback symbol - as there are no receiver feedback stream symbols yet. | Receives three received symbols, estimates and sends through the feedback channel information indicative of a third transfer matrix (as existed at the third time slot). |
| 4 | Receives first receiver feedback symbols (indicative of the first transfer matrix), calculates first transmitter feedback signal, transmits the first transmitter feedback signal as well as two transmitter data symbols. | Receives three received symbols, estimates and sends through the feedback channel information indicative of a fourth transfer matrix (as existed at the fourth time slot). |
| j | Receives (j-3)'th receiver feedback symbols (indicative of the (j-3)'th transfer matrix), calculates (j-3)'th transmitter feedback signal, transmits the (j-3)'th transmitter feedback signal as well as two transmitter data symbols. | Receives three received symbols, estimates and sends through the feedback channel information indicative of a j'th transfer matrix (as existed at the j'th time slot). |
| p | Receives (p-3)'th receiver feedback symbols (indicative of the (p-3)'th transfer matrix), calculates (p-3)'th transmitter feedback signal, transmits the (p-3)'th transmitter feedback signal as well as two transmitter data symbols. | Receives three received symbols, estimates and sends through the feedback channel information indicative of a p'th transfer matrix (as existed at the p'th time slot). |
| p + 1 | Receives (p-2)'th receiver feedback symbols (indicative of the (p-2)'th transfer matrix), performs a first additional transmission of the (p-2)'th transmitter data symbols (that were already transmitted). | Receives the received symbols. |
| p + 2 | Receives (p-1)'th receiver feedback symbols (indicative of the (p-1)'th transfer matrix), performs a first additional transmission of the (p-1)'th transmitter data symbols (that were already transmitted). | Receives the received symbols. |
| p + 3 | Receives p'th receiver feedback symbols (indicative of the p'th transfer matrix), performs a first additional transmission of the (p)'th transmitter data symbols (that were already transmitted). | Receives the received symbols. |
| p + 3*q - 2 | Performs a q'th additional transmission of the (p-2)'th transmitter data symbols. | Starts to reconstruct the (p-2)'th transmitter data symbols. |
| p + 3*q - 1 | Performs a q'th additional transmission of the (p-1)'th transmitter data symbols. | Starts to reconstruct the (p-1)'th transmitter data symbols. |
| p + 3*q | Performs a q'th additional transmission of the (p)'th transmitter data symbols. | Starts to reconstruct the (p)'th transmitter data symbols. |

In a recursive way, the receiver reconstructs the (p−2)'th, (p−1)'th and (p)'th data symbols, and then the (p−5)'th, (p−4)'th and (p−3)'th data symbols, and so on.
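The slot-by-slot behaviour of the transmitter in Table 1 can be sketched in Python as follows. This is only an illustrative sketch: the function name `transmitter_schedule` and the tuple layout are hypothetical, the feedback delay k and the number of data symbols per slot are taken from the example above (k=3, two data symbols), and only the first round of additional transmissions is listed.

```python
def transmitter_schedule(p, k=3, n_data=2):
    """Return one tuple per time slot: (slot, action, new_data_symbols, source_slot).

    source_slot is the time slot whose transfer matrix is described by the newest
    receiver feedback available at the transmitter (None while no feedback has
    arrived yet, in which case a default feedback symbol such as 0 is sent).
    In "retransmit" rows, source_slot is also the slot whose data symbols are
    transmitted again.
    """
    if p < k:
        raise ValueError("this sketch assumes p >= k")
    rows = []
    for slot in range(1, p + k + 1):
        source_slot = slot - k if slot > k else None
        if slot <= p:
            rows.append((slot, "data", n_data, source_slot))
        else:
            rows.append((slot, "retransmit", 0, source_slot))
    return rows


if __name__ == "__main__":
    for row in transmitter_schedule(p=6):
        print(row)
```

For p=6 the list has nine entries: slots 1 to 3 carry a default feedback symbol, slots 4 to 6 carry feedback derived from slots 1 to 3, and slots 7 to 9 repeat the data of slots 4 to 6.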

In a nutshell, the receiver can reconstruct the transmitter data streams by taking into account the fact that the transfer matrixes are part of a unitary matrix. The reconstruction is described in detail in appendixes A and B. The calculation of the transmitted feedback symbols is also described in appendixes A and B.

According to various embodiments of the invention, the multiple transmitter data streams and the transmitter feedback stream are transmitted concurrently, and the number of the transmitter data streams does not exceed Mt+Mr−M. The communication channel may be a wireless communication channel, a wired communication channel, a multimode optic fiber, a multi-core optic fiber and the like.

When using a multimode optic fiber the transmission paths may be implemented by multi-modes of the transmitter. Alternatively, the communication channel may be an optical fiber that has multiple cores and the transmission paths are implemented by the multiple cores.

The receiver feedback stream may include the transfer matrixes of the communication channel during the receiving or may include parts of the unitary matrices that differ from the transfer matrixes of the communication channel during the receiving but facilitate a reconstruction of the transfer matrixes. These parts of the unitary matrixes may be smaller than the transfer matrixes.

Some Theoretical Background

A common model for an optical space-division multiplexing (SDM) system that supports M (also denoted m) orthogonal (spatial and polarization) propagation modes per wavelength is as follows. There is a unitary coupling among all transmission modes, which allows the transfer matrix to be described as an m×m unitary matrix, denoted H, where each entry h_(ij) represents the complex path gain from transmitted mode i to received mode j. It is further assumed that the channel matrix H is a random instantiation drawn uniformly from the ensemble of all M×M (m×m) unitary matrices. In many practical situations not all modes are addressed, that is, the transmitter can excite Mt≦M (m_(t)≦m) modes and the receiver can coherently extract Mr≦M (m_(r)≦m) modes. Hence, neglecting waveguide nonlinearities and ignoring differential modal delays and mode dependent losses (MDL), the channel can be written as:

$y = \sqrt{SNR}\, H_{11} x + z, \qquad (1)$

where $x \in \mathbb{C}^{m_t}$ is the transmitted signal; $y \in \mathbb{C}^{m_r}$ is the received signal; the additive noise z has i.i.d. circularly symmetric complex Gaussian entries $z_i \sim \mathcal{CN}(0,1)$, $i = 1, \ldots, m_r$; and $H_{11}$ is the $m_r \times m_t$ sub-matrix of the transfer matrix, which can be partitioned as

${H = \begin{bmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{bmatrix}};$

H₁₁ is further assumed to be known at the receiver; SNR is the average signal-to-noise ratio at each received mode when all channel modes are excited and extracted (m_(r)=m_(t)=m, or equivalently, when H₁₁=H). An equal power constraint on each transmit mode is assumed (which is the common power constraint in optical communication systems), that is, the average signal energy transmitted at each symbol period and from each transmit mode is constrained to be not greater than 1:

$E[x_i^{*} x_i] \le 1 \quad \forall i = 1, \ldots, m_t. \qquad (2)$
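A minimal numerical sketch of the channel model in Equations (1) and (2) is given below, assuming NumPy. The helper names `haar_unitary` and `jacobi_channel_use` are hypothetical. The unitary matrix is drawn by QR decomposition of a complex Gaussian matrix followed by a phase correction, which is one common way to sample uniformly from the unitary group; this sampling method is an assumption of the sketch and is not taken from the description.

```python
import numpy as np


def haar_unitary(m, rng):
    """Draw an m x m unitary matrix uniformly (QR of a complex Gaussian matrix)."""
    z = (rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    # Multiply column j of q by the phase of r[j, j] to remove the QR phase ambiguity.
    return q * (d / np.abs(d))


def jacobi_channel_use(x, m, m_r, snr, rng):
    """One use of y = sqrt(SNR) * H11 @ x + z, with H11 the top-left block of H."""
    m_t = x.shape[0]
    H = haar_unitary(m, rng)
    H11 = H[:m_r, :m_t]
    # Circularly symmetric complex Gaussian noise with unit variance per entry.
    z = (rng.standard_normal(m_r) + 1j * rng.standard_normal(m_r)) / np.sqrt(2)
    y = np.sqrt(snr) * H11 @ x + z
    return y, H11, H


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.ones(3, dtype=complex)  # unit energy per transmit mode, as in Equation (2)
    y, H11, H = jacobi_channel_use(x, m=4, m_r=3, snr=100.0, rng=rng)
    print(np.linalg.svd(H11, compute_uv=False))
```

Because H11 is a block of a unitary matrix, none of the printed singular values can exceed 1.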

The theoretical analysis considers two situations: the ergodic case, where the transmission is spread over many channel realizations, and the non-ergodic case, where the transmission may face any specific channel realization. The latter case is more realistic, as the symbol duration is usually short relative to the channel variations. However, it presents a major challenge: how to communicate when the channel has faded so that its capacity is below the transmission rate. This situation is referred to as “outage”. The theoretical results we obtained, summarized below, analyze the outage probability and show that there are situations in which zero outage probability is achievable.

Non-Ergodic Channel—Outage Analysis

In the non-ergodic case, the mutual information between the input and output of the channel with specific (randomly drawn) H₁₁ is:

$I(x; y \mid H_{11} = H_{11}). \qquad (3)$

The receiver knows the realization of H₁₁. However, since the channel is unknown at the transmitter, there may be a non-zero probability that the transmission rate is not supported by the channel instantiation. This probability is termed outage probability and is given by

$P_{out}(m_t, m_r, m; R) = \Pr\left[ I(x; y \mid H_{11}) < R \right], \qquad (4)$

that is, the probability that the capacity of the channel realization is smaller than the transmission rate R (bits/symbol).

It is well known that a circularly symmetric Gaussian zero-mean input distribution maximizes the mutual information of an AWGN channel, so we get

$I(x; y \mid H_{11} = H_{11}) = \log\det\left( I_{m_r} + SNR \cdot H_{11} Q H_{11}^{\dagger} \right), \qquad (5)$

where Q is the covariance matrix of the transmitted signal and is chosen such that the outage probability is brought to minimum.

Hence, we can write:

$P_{out}(m_t, m_r, m; R) = \inf_{Q:\; Q \succeq 0,\ Q_{ii} \le 1\ \forall i = 1, \ldots, m_t} \Pr\left[ \log\det\left( I_{m_r} + SNR \cdot H_{11} Q H_{11}^{\dagger} \right) < R \right]. \qquad (6)$

The outage probability is examined for two cases: m_(t)+m_(r)≦m and m_(t)+m_(r)>m, where for the second case we show that a strictly zero outage probability is achievable for any transmission rate below (m_(t)+m_(r)−m)log(1+SNR).
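The second case can be checked numerically. The sketch below, assuming NumPy and an identity input covariance Q = I (an assumption of the sketch, not necessarily the minimizing Q of Equation (6)), draws random unitary matrices and evaluates Equation (5) in bits, using a base-2 logarithm. The function names are hypothetical. When m_(t)+m_(r)>m, the smallest value observed should not fall below (m_(t)+m_(r)−m)·log2(1+SNR).

```python
import numpy as np


def haar_unitary(m, rng):
    """Draw an m x m unitary matrix uniformly (QR of a complex Gaussian matrix)."""
    z = (rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))


def capacity_bits(H11, snr):
    """log2 det(I + SNR * H11 H11^dagger), i.e. Equation (5) with Q = I."""
    m_r = H11.shape[0]
    gram = np.eye(m_r) + snr * H11 @ H11.conj().T
    _, logdet = np.linalg.slogdet(gram)  # natural-log determinant of a Hermitian PD matrix
    return logdet / np.log(2)


def min_capacity(m_t, m_r, m, snr, trials, seed=0):
    """Smallest capacity seen over `trials` random channel realizations."""
    rng = np.random.default_rng(seed)
    return min(
        capacity_bits(haar_unitary(m, rng)[:m_r, :m_t], snr) for _ in range(trials)
    )


if __name__ == "__main__":
    snr = 100.0  # 20 dB
    floor = (3 + 3 - 4) * np.log2(1 + snr)
    print(min_capacity(3, 3, 4, snr, trials=1000), ">=", floor)
```

A finite number of trials cannot prove a zero outage probability; the sketch only illustrates that no sampled realization falls below the stated rate.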

The case of m_(t)+m_(r)≦m

Let the transmission rate be R=r log(1+SNR) bps/Hz

We have shown theoretically that:

$P_{out}\left(m_t, m_r, m; r\log(1+SNR)\right) = K_{m_t,m_r,m}^{-1} \int_{B} \prod_{i=1}^{\min\{m_t,m_r\}} \lambda_i^{|m_r - m_t|} (1-\lambda_i)^{m - m_r - m_t} \prod_{i<j} (\lambda_i - \lambda_j)^2 \, d\lambda, \qquad (7)$

where $K_{m_t,m_r,m}$ is a normalizing factor and

$B = \left\{ \lambda : 0 \le \lambda_1 \le \ldots \le \lambda_{\min\{m_t,m_r\}} \le 1,\ \prod_{i=1}^{\min\{m_t,m_r\}} \left( 1 + SNR \cdot \lambda_i \right) < (1+SNR)^r \right\}$

is the set that describes the outage event.

This gives an analytical expression to the outage probability.

The case of m_(t)+m_(r)>m

Suppose, again, that the transmission rate is R=r log(1+SNR) bps/Hz, with 0≦r≦min{m_(t),m_(r)}, where the number of addressed modes satisfies m_(t)+m_(r)>m. We have shown that for r<m_(t)+m_(r)−m, the outage probability is strictly zero. For r≧m_(t)+m_(r)−m the outage probability satisfies

$P_{out}\left(m_t, m_r, m; r\log(1+SNR)\right) = P_{out}\left(m - m_r, m - m_t, m; \left(r - (m_t + m_r - m)\right)\log(1+SNR)\right) \qquad (8)$

where the right hand side is given by Equation (7).
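Equation (8) can be illustrated with a Monte Carlo sketch, assuming NumPy and an identity input covariance Q = I (an assumption of the sketch). For m=4, m_(t)=m_(r)=3 and r=2.5, Equation (8) maps the problem to a 1×1 sub-block with r=0.5. For that 1×1 case, evaluating Equation (7) gives 1−(1−λ*)^(m−1) with λ*=((1+SNR)^r−1)/SNR; this closed form is worked out here only for the example and is not quoted from the description. The function names are hypothetical.

```python
import numpy as np


def haar_unitary(m, rng):
    """Draw an m x m unitary matrix uniformly (QR of a complex Gaussian matrix)."""
    z = (rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))


def outage_mc(m_t, m_r, m, r, snr, trials=10000, seed=0):
    """Monte Carlo estimate of P_out(m_t, m_r, m; r*log(1+SNR)) with Q = I."""
    rng = np.random.default_rng(seed)
    rate = r * np.log2(1 + snr)
    outages = 0
    for _ in range(trials):
        s = np.linalg.svd(haar_unitary(m, rng)[:m_r, :m_t], compute_uv=False)
        capacity = np.sum(np.log2(1 + snr * s**2))
        outages += int(capacity < rate)
    return outages / trials


def closed_form_1x1(r, snr, m=4):
    """Outage of a 1x1 sub-block of an m x m unitary matrix, for 0 <= r <= 1."""
    lam = ((1 + snr) ** r - 1) / snr
    return 1 - (1 - min(lam, 1.0)) ** (m - 1)


if __name__ == "__main__":
    snr = 100.0  # 20 dB
    print("left side of (8): ", outage_mc(3, 3, 4, 2.5, snr, seed=1))
    print("right side of (8):", outage_mc(1, 1, 4, 0.5, snr, seed=2))
    print("closed form:      ", closed_form_1x1(0.5, snr))
```

At SNR=100 the closed form evaluates to about 0.25, and both estimates should agree with it up to Monte Carlo error.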

System and Method for Zero Outage with Channel State Feedback

We now present the new communication scheme for an m_(r)×m_(t) MIMO channel, where m_(t)+m_(r)>m. Using simple manipulations, the scheme exploits a (delayed) feedback system to complete the remaining m−max{m_(t),m_(r)} singular values to 1. Thus the channel is transformed into m_(t)+m_(r)−m independent SISO channels, supporting m_(t)+m_(r)−m streams (degrees of freedom) with zero outage probability. Furthermore, the scheme provides means for simple decoding, namely the decoding used in a single-input single-output (SISO) channel, avoiding the need for complicated MIMO processing.
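The singular value structure that the scheme relies on can be observed numerically. The following sketch, assuming NumPy, counts how many singular values of a random m_(r)×m_(t) sub-block equal 1 up to a tolerance. The function names and the tolerance are hypothetical choices for illustration.

```python
import numpy as np


def haar_unitary(m, rng):
    """Draw an m x m unitary matrix uniformly (QR of a complex Gaussian matrix)."""
    z = (rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))


def count_unit_singular_values(m_t, m_r, m, seed=0, tol=1e-9):
    """Return (number of singular values of H11 within tol of 1, all singular values)."""
    rng = np.random.default_rng(seed)
    h11 = haar_unitary(m, rng)[:m_r, :m_t]
    s = np.linalg.svd(h11, compute_uv=False)
    return int(np.sum(s > 1 - tol)), s


if __name__ == "__main__":
    count, s = count_unit_singular_values(3, 3, 4)
    print(count, s)  # two singular values equal 1; the third is random
```

For m=4 and m_(t)=m_(r)=3, two singular values equal 1 and m−max{m_(t),m_(r)}=1 singular value is random, which is the one the feedback scheme completes to 1.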

The principles of the scheme are described using the following simple example:

Suppose there are 4 modes in the fiber. Let the transmitter and receiver address 3 out of the 4 available modes, i.e., the transfer matrix is a 3×3 sub-matrix of a 4×4 unitary matrix. According to the analysis above, two degrees of freedom can be communicated to the receiver with zero outage probability. Now, suppose only for the simplicity of the example that the channel instantiation changes independently at each channel use and let

${H^{(i)} = \begin{bmatrix} H_{11}^{(i)} & H_{12}^{(i)} \\ H_{21}^{(i)} & H_{22}^{(i)} \end{bmatrix}};$

be the unitary matrix realization at channel use i.

In addition, suppose that at each channel use i the transmitter has perfect knowledge of $H_{21}^{(i-1)}$, the (m−m_(r))×m_(t) sub-matrix realization of $H^{(i-1)}$. Let the transmitter excite the following three-entry vector at each channel use i=1, . . . , n:

${x^{(i)} = \begin{bmatrix} x_{1}^{(i)} \\ x_{2}^{(i)} \\ {H_{21}^{({i - 1})}x^{({i - 1})}} \end{bmatrix}},$

where we define x⁽⁰⁾ to be a vector of zeros. Hence, in each channel use the transmitter communicates two new information bearing symbols and a linear combination of the previous signal.

The received signal at each channel use i=1, . . . , n is

y ^((i))=√{square root over (SNR)}H ₁₁ ^((i)) x ^((i)) +z ^((i)).

Now, since H₁₁ ^((i)) is assumed to be known at the receiver, H₂₁ ^((i)) can be also computed using the orthonormality of H^((i))'s columns. We further assume that the receiver has as a side information the following noisy measure of x^((n))

y _(si) ^((n))=√{square root over (SNR)}H ₂₁ ^((n)) x ^((n)) +z _(si) ^((n)),

where z_(si) ^((n))~CN(0,1) is independent of z^((n)).

Thus, the receiver can construct the following vector

$y^{(n)} = {\begin{bmatrix} H_{11}^{{(n)}\dagger} & H_{21}^{{(n)}\dagger} \end{bmatrix} \cdot \begin{bmatrix} y^{(n)} \\ y_{si}^{(n)} \end{bmatrix}}$

which satisfies

$\begin{matrix} {y^{(n)} = {\begin{bmatrix} y_{1}^{(n)} \\ y_{2}^{(n)} \\ y_{3}^{(n)} \end{bmatrix} = {{\sqrt{SNR}x^{(n)}} + z^{(n)}}}} \\ {{= {{\sqrt{SNR}\begin{bmatrix} x_{1}^{(n)} \\ x_{2}^{(n)} \\ {H_{21}^{({n - 1})}x^{({n - 1})}} \end{bmatrix}} + z^{(n)}}},} \end{matrix}$

where z^((n)) has i.i.d. CN(0,1) entries.

Letting y₃ ^((n)) be y_(si) ^((n−1)), the side information for channel use n−1, and repeating this procedure for i=n−1 down to 1 results in two streams of measures

$\begin{bmatrix} y_{1}^{(1)} \\ y_{2}^{(1)} \end{bmatrix},\begin{bmatrix} y_{1}^{(2)} \\ y_{2}^{(2)} \end{bmatrix},\ldots \mspace{14mu},{\begin{bmatrix} y_{1}^{(n\;)} \\ y_{2}^{(n)} \end{bmatrix}.}$

Thus, we get two information streams as if they were communicated through two independent SISO channels, each with signal-to-noise ratio SNR (and therefore with zero outage probability).

Note that the scheme is feasible if the side information after channel use n is conveyed by the transmitter through a negligible number of channel uses (with respect to n) and if a feedback system is employed to communicate H₂₁ ^((i)) to the transmitter after each channel use i.
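The example above can be exercised numerically. The following is a minimal sketch, assuming NumPy, a QR-based draw of the Haar unitary matrix, and unit-energy QPSK symbols; the helper names `cn`, `haar_unitary` and `simulate` are hypothetical and do not come from the scheme itself. It runs the forward transmissions for 3 of 4 addressed modes and then the backward combining from channel use n down to 1.

```python
import numpy as np

def cn(shape, rng):
    """i.i.d. CN(0,1) samples (assumed noise model)."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def haar_unitary(m, rng):
    """Haar-distributed m x m unitary via QR with column-phase correction."""
    q, r = np.linalg.qr(cn((m, m), rng))
    d = np.diag(r)
    return q * (d / np.abs(d))

def simulate(n=20, snr=100.0, noisy=True, seed=0):
    rng = np.random.default_rng(seed)
    m, mt, mr = 4, 3, 3
    d = mt + mr - m                       # two new symbols per channel use
    scale = 1.0 if noisy else 0.0
    chans, sent, rx = [], [], []
    fb = np.zeros(m - mr, dtype=complex)  # H21^(i-1) x^(i-1); zero for i = 1
    for _ in range(n):
        H = haar_unitary(m, rng)
        H11, H21 = H[:mr, :mt], H[mr:, :mt]
        s = (rng.choice([-1.0, 1.0], d) + 1j * rng.choice([-1.0, 1.0], d)) / np.sqrt(2)
        x = np.concatenate([s, fb])       # new symbols plus fed-back combination
        rx.append(np.sqrt(snr) * H11 @ x + scale * cn(mr, rng))
        chans.append((H11, H21))
        sent.append(s)
        fb = H21 @ x                      # what the feedback lets the transmitter form
    # side information for the last channel use
    y_si = np.sqrt(snr) * fb + scale * cn(m - mr, rng)
    streams = [None] * n
    for i in reversed(range(n)):          # backward pass, i = n, ..., 1
        H11, H21 = chans[i]
        y_t = H11.conj().T @ rx[i] + H21.conj().T @ y_si
        streams[i] = y_t[:d]              # two SISO-like observations
        y_si = y_t[d:]                    # becomes side information for use i-1
    return np.array(sent), np.array(streams)
```

With `noisy=False` the recovered streams equal √SNR times the transmitted symbols, which reflects the identity H₁₁^(†)H₁₁+H₂₁^(†)H₂₁=I; with noise enabled, the residual is expected to have unit variance per entry.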

We now formalize the scheme for any m_(r)×m_(t) channel that satisfies m_(t)+m_(r)>m.

Let

$H^{{(i)}\;} = \begin{bmatrix} H_{11}^{(i)} & H_{12}^{(i)} \\ H_{21\;}^{(i)} & H_{22}^{(i)} \end{bmatrix}$

be the unitary matrix realization at channel use i.

We assume perfect knowledge of H₁₁ ^((i)) at the receiver and a noiseless feedback communication with a delay of k channel uses. Since H^((i)) is unitary, H₂₁ ^((i)) can be computed from H₁₁ ^((i)) and we assume that the receiver noiselessly communicates H₂₁ ^((i)) to the transmitter. Note that H₂₁ ^((i)) completes the columns of H₁₁ ^((i)) into orthonormal columns; thus, for m_(t)+m_(r)−m>1 and certain matrix instantiations, the computed H₂₁ ^((i)) is not unique and can be chosen judiciously (see Remark 4 below).

Now, the transmitter excites the following signal from the addressed modes at each channel use i=1, . . . , nk:

${x^{(i)} = \begin{bmatrix} x_{1}^{(i)} \\ \vdots \\ x_{m_{t} + m_{r} - m}^{(i)} \\ {H_{21}^{({i - k})}x^{({i - k})}} \end{bmatrix}},$

where x^((i)), for i=−(k−1), . . . , 0, is a vector of zeros.

That is, the transmitter conveys m_(t)+m_(r)−m new information bearing symbols and H₂₁ ^((i−k))x^((i−k)), a linear combination of the signal that was transmitted k channel uses before. Note that since H is unitary, the power constraint remains satisfied.

Now, assume the transmitter communicates to the receiver the following measures

y _(si) ^((i))=√{square root over (SNR)}H ₂₁ ^((i)) x ^((i)) +z _(si) ^((i)) ∀i=(n−1)k+1, . . . , nk,

That is, noisy measures of the last k transmitted signals, where z_(si) ^((i)) are independent with i.i.d. CN(0,1) entries. As was shown above, the receiver can use the side information to get

$\begin{matrix} {\begin{bmatrix} y_{1}^{(i)} \\ \vdots \\ y_{m_{t\;}}^{(i)} \end{bmatrix} = {{\sqrt{SNR}x^{(i)}} + z^{(i)}}} \\ {= {{\sqrt{SNR}\begin{bmatrix} x_{1}^{(i)} \\ \vdots \\ x_{m_{t} + m_{r} - m}^{(i)} \\ {H_{21}^{({i - k})}x^{({i - k})}} \end{bmatrix}} + {z^{(i)}.}}} \end{matrix}$

for all i=(n−1)k+1, . . . , nk, where z^((i)) are independent with i.i.d CN(0,1) entries.

Letting

$\begin{bmatrix} y_{m_{t} + m_{r} - m + 1}^{(i)} \\ \vdots \\ y_{m_{t}}^{(i)} \end{bmatrix} = {{\sqrt{SNR}H_{21}^{({i - k})}x^{({i - k})}} + \begin{bmatrix} z_{m_{t} + m_{r} - m + 1}^{(i)} \\ \vdots \\ z_{m_{t}}^{(i)} \end{bmatrix}}$

be the side information y_(si) ^((i−k)) measures for channel use i−k, for all i=(n−1)k+1, . . . , nk, and repeating this procedure for i=(n−1)k till i=1 results in m_(t)+m_(r)−m independent streams of measures

$\begin{bmatrix} y_{1}^{(1)} \\ \vdots \\ y_{m_{t} + m_{r} - m}^{(1)} \end{bmatrix},\ldots \mspace{14mu},{\begin{bmatrix} y_{1}^{({nk})} \\ \vdots \\ y_{m_{t} + m_{r} - m}^{({nk})} \end{bmatrix}.}$

Thus, having noisy measures of the last k symbols, y_(si) ^(((n−1)k+1)), . . . , y_(si) ^((nk)), the receiver can construct m_(t)+m_(r)−m SISO channels, each with a signal-to-noise ratio SNR. Assuming the transmitter can convey these measures using a negligible number of channel uses (with respect to n, see Remark 3 below), the scheme allows approaching the rate (m_(t)+m_(r)−m) log(1+SNR) with zero outage probability.

Remark 1 (Delayed Feedback):

The scheme exploits a noiseless feedback system to communicate (possibly) outdated information, namely the channel realization in previous channel uses. Thus, the feedback is not required to be fast; that is, there is no limitation on the delay time k.

However, for non-ergodic systems with a short delay time, the feedback may carry information about the current channel realization. Thus, the transmitter can exploit the up-to-date feedback to use more efficient schemes (e.g., water filling). Nevertheless, for systems with a long delay time (e.g., relatively long distance optical fibers), the channel can be regarded as non-ergodic but with outdated feedback. In these cases our scheme efficiently achieves zero outage probability.

Remark 2 (Simple Decoding):

The scheme constructs m_(t)+m_(r)−m independent streams of measures, each with signal-to-noise SNR. This allows the decoding stage to be simple, where a SISO channel decoder can be used, removing the need for further MIMO signal processing.

Remark 3 (Side Information Measures):

Given noisy measures of the last k transmitted signals,

y _(si) ^((i))=√{square root over (SNR)}H ₂₁ ^((i)) x ^((i)) +z _(si) ^((i)) , ∀i=nk−(k−1), . . . , nk,

where z_(si) ^((i)) are independent with i.i.d. CN(0,1) entries, the scheme can construct m_(t)+m_(r)−m independent streams of measures.

Thus, the transmitter has to convey H₂₁ ^((i))x^((i)), for each i=nk−(k−1), . . . , nk, such that the receiver can extract a vector of noisy measures with a signal-to-noise ratio that is not smaller than SNR. This is feasible with a finite number of channel uses. For example, the repetition scheme can be used to convey these measures, each with a signal-to-noise ratio that is at least SNR. Suppose each H₂₁ ^((i))x^((i)) is conveyed to the receiver within N_(si) channel uses (e.g., for the repetition scheme N_(si)=m_(t)(m−m_(r))). By taking n large enough (with respect to N_(si)) one can approach the rate (m_(t)+m_(r)−m)log(1+SNR).

The repetition scheme can convey the m−m_(r) entries of H₂₁ ^((i))x^((i)) with N_(si)=m_(t)(m−m_(r)) channel uses. In each channel use a single entry is transmitted through a single mode (while all other modes are zero) in a way that all entries are transmitted through all modes.
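The overhead described in Remark 3 can be put into numbers. The sketch below is only illustrative bookkeeping under an assumption that is not stated explicitly in the text: the nk data-bearing channel uses are followed by k side-information vectors, each costing N_(si) channel uses, so the fraction of useful channel uses is n/(n+N_(si)). The function names are hypothetical.

```python
import math

def repetition_uses(m, mt, mr):
    """N_si for the repetition scheme: m_t * (m - m_r) channel uses per vector."""
    return mt * (m - mr)

def effective_rate(m, mt, mr, snr, n):
    """Rate in bits per channel use after charging the side-information overhead.

    Assumption: total channel uses = n*k + k*N_si, so the useful fraction is
    n / (n + N_si), independent of the feedback delay k.
    """
    n_si = repetition_uses(m, mt, mr)
    target = (mt + mr - m) * math.log2(1 + snr)
    return target * n / (n + n_si)
```

For m=4 and m_(t)=m_(r)=3 the repetition scheme costs 3 channel uses per side-information vector, so n=300 already yields about 99% of the target rate under this accounting.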

Remark 4 (The Uniqueness of H₂₁ ^((i))):

The scheme can be further improved to support an even higher data rate with zero outage probability. For example, the last m−m_(r) entries of the transmitted signal at the first k channel uses can be used to excite information bearing symbols instead of zero symbols. Furthermore, as was mentioned above, when m_(t)+m_(r)−m>1, H₂₁ ^((i)) is not unique; there are many (m−m_(r))×m_(t) matrices that complete the columns of H₁₁ ^((i)) into orthonormal columns. Thus, the receiver can choose H₂₁ ^((i)) to be the one with the largest number of zero rows. Now, at time i+k the transmitter excites m_(t)+m_(r)−m new symbols and H₂₁ ^((i))x^((i)), a retransmission of x^((i)), the transmitted signal at time i. With an appropriate choice of H₂₁ ^((i)), H₂₁ ^((i))x^((i)) contains entries that are zero.

Instead, these entries can contain additional new information bearing symbols.

APPENDIX A Mathematical Analysis of the Jacobi Mimo Channel

In our work we analyze the channel with respect to the number of addressed paths in the transmitter and receiver. We start by defining the system model and presenting the channel statistics in Section 2. An interesting transition threshold is revealed: when the number of addressed paths is large enough, the statistics of the problem change. Using this new knowledge we give analytic expressions for the ergodic capacity in Section 3. In Section 4 we analyze the outage probabilities in the non-ergodic channel and show that for a large enough number of addressed paths a strictly zero outage probability is obtainable (for certain transmission rates). We further present a new communication scheme that achieves the highest rate possible with no outage. Section 5 discusses the diversity-multiplexing tradeoff of the channel, where we show a striking difference in the maximum diversity gain between the Rayleigh fading and the Jacobi optical channels. Section 6 summarizes and discusses the results.

2 System Model and Channel Statistics

We consider an optical space-division multiplexing (SDM) system that supports m orthogonal (spatial and polarization) propagation modes per wavelength. We assume a unitary coupling among all transmission modes, allowing us to describe the transfer matrix as an m×m unitary matrix, denoted H, where each entry h_(ij) represents the complex path gain from transmitted mode i to received mode j. We further assume a uniformly distributed unitary coupling, that is, the channel matrix H is assumed to be a random instantiation drawn uniformly from the ensemble of all m×m unitary matrices. However, we consider the case where not all modes are being addressed, that is, the transmitter can excite m_(t)≦m modes and the receiver can coherently extract m_(r)≦m modes. Hence, neglecting waveguide nonlinearities and ignoring differential modal delays and mode dependent losses (MDL), the channel can be written as:

y=√{square root over (SNR)}H ₁₁ x+z,  (1)

where x∈C^(m_(t)) is the transmitted signal; y∈C^(m_(r)) is the received signal; the additive noise z has i.i.d. circularly symmetric complex Gaussian entries z_(i)~CN(0,1), i=1, . . . , m_(r).

H₁₁ is the m_(r)×m_(t) sub-matrix of the transfer matrix which can be defined as

${H = \begin{bmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{bmatrix}};$

H₁₁ is further assumed to be known at the receiver; SNR is the average signal-to-noise ratio at each received mode when all channel modes are excited and extracted (m_(r)=m_(t)=m, or equivalently, when H₁₁=H). In this paper we assume an equal power constraint on each transmit mode (which is the common power constraint in optical communication systems), that is, the average signal energy transmitted at each symbol period and from each transmit mode is constrained to be not greater than 1:

E[x _(i) *x _(i)]≦1 ∀i=1, . . . , m _(t).  (2)

Now, to be able to analytically analyze this channel we need to understand the statistics of the channel matrix H₁₁. To that end we will briefly go over three classical random matrix ensembles, the Gaussian (Hermite), Wishart (Laguerre) and Jacobi (MANOVA—multivariate analysis of variance) ensembles. We limit the discussion to complex ensembles.

Definition 1 (Gaussian Ensemble)

G(m,n) is m×n matrix of i.i.d complex entries distributed as CN(0,1).

The following ensembles are constructed from the Gaussian ensemble as follows.

Definition 2 (Wishart Ensemble)

W(m,n), where m≧n, is n×n Hermitian matrix which can be constructed as A^(†)A, where A is G(m,n).

Definition 3 (Jacobi Ensemble)

J(m₁,m₂,n), where m₁,m₂≧n, is n×n Hermitian matrix which can be constructed as A(A+B)⁻¹, where A and B are W(m₁,n) and W(m₂,n), respectively.

The first two ensembles relate to wireless communication. The Gaussian ensemble is the most common statistical model for the wireless MIMO channel and the eigenvalues of the Wishart ensemble share the same distribution with the squared singular values of the channel matrix. We claim here that the third classical ensemble, the Jacobi ensemble, completes the applications of random matrix ensembles in communication by relating to the discussed under-addressed channel.

It was shown that there is a deep connection between the eigenvalues of a Jacobi matrix and the (squared) singular values of a sub-matrix of Haar-distributed unitary matrix. More precisely, let

$U = \begin{bmatrix} U_{11} & U_{12} \\ U_{21} & U_{22} \end{bmatrix}$

be m×m Haar-distributed unitary matrix (that is, drawn uniformly from the ensemble of all m×m unitary matrices), where U₁₁ is the m_(r)×m_(t) sub-matrix of U. If m_(r)≧m_(t) and m−m_(r)≧m_(t), the eigenvalues of U₁₁ ^(†)U₁₁ share the same distribution with the eigenvalues of a Jacobi matrix J(m_(r),m−m_(r),m_(t)).
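The stated connection can be probed with a small Monte Carlo experiment. The sketch below assumes NumPy and a QR-based Haar draw; all function names are hypothetical. It compares the average eigenvalue of U₁₁^(†)U₁₁ with the average eigenvalue of a Jacobi matrix built as A(A+B)⁻¹ from two Wishart matrices. Both averages should be close to m_(r)/m, which is only a first-moment consistency check and not a proof of equality in distribution.

```python
import numpy as np

def cn(shape, rng):
    """i.i.d. CN(0,1) entries, i.e. a draw of the Gaussian ensemble G."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def haar_unitary(m, rng):
    """Haar-distributed unitary via QR with column-phase correction."""
    q, r = np.linalg.qr(cn((m, m), rng))
    d = np.diag(r)
    return q * (d / np.abs(d))

def wishart(m, n, rng):
    """W(m, n): the n x n matrix A^dagger A with A drawn from G(m, n)."""
    a = cn((m, n), rng)
    return a.conj().T @ a

def mean_eig_submatrix(m, mt, mr, trials, rng):
    total = 0.0
    for _ in range(trials):
        u11 = haar_unitary(m, rng)[:mr, :mt]
        total += np.linalg.eigvalsh(u11.conj().T @ u11).mean()
    return total / trials

def mean_eig_jacobi(m1, m2, n, trials, rng):
    total = 0.0
    for _ in range(trials):
        a, b = wishart(m1, n, rng), wishart(m2, n, rng)
        # eigenvalues of A (A+B)^{-1} are real and lie in [0, 1]
        lam = np.linalg.eigvals(a @ np.linalg.inv(a + b)).real
        total += lam.mean()
    return total / trials
```

For example, with m=6, m_(t)=2 and m_(r)=3 (so that m_(r)≧m_(t) and m−m_(r)≧m_(t)), both `mean_eig_submatrix(6, 2, 3, ...)` and `mean_eig_jacobi(3, 3, 2, ...)` are expected to approach 0.5.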

We shall now present the statistics of the eigenvalues of H₁₁ ^(†)H₁₁ using the above result.

2.1 The Case of m_(t)+m_(r)≦m

For H₁₁ with m_(r)≧m_(t) and m−m_(r)≧m_(t) (that is, m_(t)+m_(r)≦m), according to the above arguments, the eigenvalues of H₁₁ ^(†)H₁₁ follow the Jacobi ensemble J(m_(r),m−m_(r),m_(t)). For m_(r)<m_(t), since H^(†) is also Haar distributed, the eigenvalues of H₁₁H₁₁ ^(†) share the same distribution with the eigenvalues of the Jacobi ensemble J(m_(t),m−m_(t),m_(r)). Since A^(†)A and AA^(†) have the same non-zero eigenvalues, we can say that the statistics of the non-zero eigenvalues of H₁₁ ^(†)H₁₁ follow the Jacobi ensemble,

J(m _(max) ,m−m _(max) ,m _(min)).  (3)

where m_(max)=max{m_(t),m_(r)} and m_(min)=min{m_(t),m_(r)}.

The joint probability density function (pdf.) of the ordered eigenvalues 0≦λ₁≦ . . . ≦λ_(m_(min))≦1 of the Jacobi ensemble (3) is well known:

$\begin{matrix} {{f_{\lambda}\left( {m_{t},m_{r},m;\lambda_{1},\ldots,\lambda_{\min\{m_{t},m_{r}\}}} \right)} = {K_{m_{t},m_{r},m}^{- 1}{\prod\limits_{i = 1}^{\min\{m_{t},m_{r}\}}{\lambda_{i}^{|{m_{r} - m_{t}}|}\left( {1 - \lambda_{i}} \right)^{m - m_{r} - m_{t}}}}{\prod\limits_{i < j}\left( {\lambda_{i} - \lambda_{j}} \right)^{2}}},} & (4) \end{matrix}$

where K_(m_(t),m_(r),m) is a normalizing constant. Thus, the joint pdf. of the ordered non-zero eigenvalues of H₁₁ ^(†)H₁₁, for the case of m_(t)+m_(r)≦m, follows (4).

2.2 The Case of m_(t)+m_(r)>m

When the sum of transmitted and received addressed modes, m_(t)+m_(r), is larger than the total available modes, m, the statistics of the singular values change. Having in mind the orthonormality of H's columns, one can think of m_(t)+m_(r)>m as a transition threshold in which the sub-matrix H₁₁ is large enough with respect to H to change the singularity statistics. The following Lemma provides the joint pdf. of H₁₁'s singular values, showing that with probability 1 there are m_(t)+m_(r)−m singular values which are 1.

Lemma 1. Suppose

$H = \begin{bmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{bmatrix}$

is a Haar distributed m×m unitary matrix, where H₁₁ is an m_(r)×m_(t) matrix and m_(t)+m_(r)>m. Denoting m_(max)=max{m_(t),m_(r)} and m_(min)=min{m_(t),m_(r)}, H₁₁ ^(†)H₁₁ has: a. m_(t)+m_(r)−m eigenvalues which are 1. b. m_(t)−m_(min) zero eigenvalues. c. m−m_(max) eigenvalues which are equal to the non-zero eigenvalues of H₂₂ ^(†)H₂₂, and are therefore distributed according to the Jacobi ensemble J(m−m_(min),m_(min),m−m_(max)).

Proof. We denote by λ₁ ^((kj)), . . . , λ_(m_(t)) ^((kj)) the eigenvalues of H_(kj) ^(†)H_(kj) for (k, j)=(1,1),(2,1). We further let {tilde over (λ)}₁ ^((kj)), . . . , {tilde over (λ)}_(m−m_(r)) ^((kj)) be the eigenvalues of H_(kj)H_(kj) ^(†) for (k, j)=(2,1),(2,2). Since H is unitary we can write

H ₁₁ ^(†) H ₁₁ +H ₂₁ ^(†) H ₂₁ =I _(m) _(t) ,  (5)

where here and throughout the rest of the paper we denote by I_(n) the n×n identity matrix.

Thus, we get

λ_(i) ⁽¹¹⁾=1−λ_(i) ⁽²¹⁾ ∀i=1, . . . , m _(t).  (6)

In the same manner we have H₂₁H₂₁ ^(†)+H₂₂H₂₂ ^(†)=I_(m−m) _(r) , thus

{tilde over (λ)}_(i) ⁽²¹⁾=1−{tilde over (λ)}_(i) ⁽²²⁾ ∀i=1, . . . , m−m _(r).  (7)

Now, H₂₁ is an (m−m_(r))×m_(t) matrix. Since m−m_(r)<m_(t), H₂₁ ^(†)H₂₁ has (at least) m_(t)+m_(r)−m zero eigenvalues; thus, by applying (6), H₁₁ ^(†)H₁₁ has (at least) m_(t)+m_(r)−m eigenvalues which are 1. Since H_(kj)H_(kj) ^(†) and H_(kj) ^(†)H_(kj) share the same non-zero eigenvalues we can combine (6) and (7) to conclude that the additional m−m_(max) non-zero eigenvalues of H₁₁ ^(†)H₁₁ are equal to the m−m_(max) non-zero eigenvalues of H₂₂ ^(†)H₂₂.

The above arguments hold for any unitary matrix, in particular for any realization of the transfer matrix H. Noting that H₂₂ is an (m−m_(r))×(m−m_(t)) matrix and therefore falls under the first case ((m−m_(r))+(m−m_(t))<m) completes the proof.
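The eigenvalue counts of Lemma 1 can be checked on a random realization. The following sketch assumes NumPy and a QR-based Haar draw; `check_lemma1` and the tolerance `tol` are hypothetical choices made for illustration only.

```python
import numpy as np

def haar_unitary(m, rng):
    """Haar-distributed unitary via QR with column-phase correction."""
    g = (rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))) / np.sqrt(2)
    q, r = np.linalg.qr(g)
    d = np.diag(r)
    return q * (d / np.abs(d))

def check_lemma1(m, mt, mr, seed=0, tol=1e-9):
    """Count unit and zero eigenvalues of H11^dagger H11 and compare the rest to H22."""
    rng = np.random.default_rng(seed)
    H = haar_unitary(m, rng)
    H11, H22 = H[:mr, :mt], H[mr:, mt:]
    ev11 = np.linalg.eigvalsh(H11.conj().T @ H11)   # ascending, m_t values
    ev22 = np.linalg.eigvalsh(H22.conj().T @ H22)   # ascending, m - m_t values
    ones = int(np.sum(ev11 > 1 - tol))
    zeros = int(np.sum(ev11 < tol))
    rest11 = ev11[(ev11 >= tol) & (ev11 <= 1 - tol)]
    rest22 = ev22[ev22 >= tol]                      # non-zero eigenvalues only
    return ones, zeros, rest11, rest22
```

For m=5, m_(t)=4 and m_(r)=3 the lemma predicts two unit eigenvalues, one zero eigenvalue, and one remaining eigenvalue shared with H₂₂ ^(†)H₂₂.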

3 Ergodic Channel

In this section we assume that the channel is rapidly changing or the signal samples the entire channel statistics. The channel is assumed to be known at the receiver but not at the transmitter. In this case, the mutual information between the input and output of the channel is

E[I(x;y|H ₁₁ =H ₁₁)],  (8)

where the expectation is over H₁₁. Since the channel is fast fading, i.e. the signal samples the entire channel statistics, we average over the channel matrix distribution.

It is well known that a circularly symmetric Gaussian zero-mean input distribution achieves the capacity of this channel, which is given by

$\begin{matrix} {{{C\left( {m_{t},m_{r},{m;{SNR}}} \right)} = {\underset{{{Q_{ii} \leq {1{\forall i}}} = 1},\ldots \mspace{11mu},m_{t}}{\max\limits_{{Q\text{:}\mspace{11mu} Q} \pm 0}}\mspace{14mu} {E\left\lbrack {\log \mspace{11mu} {\det \left( {I_{m_{r}} + {{{SNR} \cdot H_{11}}{QH}_{11}^{\dagger}}} \right)}} \right\rbrack}}},} & (9) \end{matrix}$

Where Q is the covariance matrix of the transmitted signal and is chosen to maximize the average mutual information. The following theorem shows that the identity matrix achieves capacity. We note that because of the equal power constraint per-mode, the capacity is symmetric in m_(t) and m_(r).

Theorem 1

The ergodic capacity of the channel is achieved when the transmitted signal is circularly symmetric Gaussian zero-mean with covariance I_(m) _(t) and is given by

C(m _(t) ,m _(r) ,m;SNR)=E[log det(I _(m) _(t) +SNR·H ₁₁ ^(†) H ₁₁)].  (10)

Proof. The capacity in (9) satisfies:

$\begin{matrix} {{C\left( {m_{t},m_{r},{m;{SNR}}} \right)} = {\underset{{{Q_{ii} \leq {1{\forall i}}} = 1},\ldots \;,m_{t}}{\max\limits_{{Q\text{:}Q} \pm 0}}{E\left\lbrack {\log \; {\det \left( {I_{m_{r}} + {{{SNR} \cdot H_{11}}{QH}_{11}^{\dagger}}} \right)}} \right\rbrack}}} & (11) \\ {\mspace{79mu} {\leq {\underset{{{Q_{ii} \leq {1{\forall i}}} = 1},\ldots \;,m_{t}}{\max\limits_{{Q\text{:}Q} \pm 0}}{{E\left\lbrack {\log \; {\det \left( {I_{m_{r}} + {{{SNR} \cdot H_{11}}{QH}_{11}^{\dagger}}} \right)}} \right\rbrack}.}}}} & (12) \end{matrix}$

It was shown that Q=I_(m) _(t) maximizes (12) for any distribution of H₁₁ that is invariant under unitary permutation. Since Q=I_(m) _(t) satisfies also Q_(ii)≦1∀i=1, . . . , m_(t) we can write

C(m _(t) ,m _(r) ,m;SNR)=E[log det(I _(m) _(r) +SNR·H ₁₁ H ₁₁ ^(†))],  (13)

where it can be easily shown that the distribution of H₁₁ is invariant under unitary permutation since H is Haar-distributed, that is, invariant under unitary permutation.

To complete the proof we use log det(I_(m_(r))+SNR·H₁₁H₁₁ ^(†))=log det(I_(m_(t))+SNR·H₁₁ ^(†)H₁₁).

3.1 The Case of m_(t)+m_(r)≦m

The following theorem gives an analytical expression to the ergodic capacity for m_(t)+m_(r)≦m. Using the joint pdf. of the eigenvalues of the Jacobi ensemble we associate the ergodic capacity with the Jacobi polynomials.

Theorem 2

The ergodic capacity, for m_(t)+m_(r)≦m, satisfies

$\begin{matrix} {{{C\left( {m_{t},m_{r},{m;{SNR}}} \right)} = {\frac{1}{m_{\min}}{\sum\limits_{k = 1}^{m_{\min}}\; {b_{k,\alpha,\beta}^{- 1}{\int_{0}^{1}{{\log \left( {1 + {{SNR}\  \cdot \lambda}} \right)}\left( {P_{k}^{({\alpha,\beta})}\left( {1 - {2\lambda}} \right)} \right)^{2}{\lambda^{\alpha}\left( {1 - 2} \right)}^{\beta}{\lambda}}}}}}},} & (14) \end{matrix}$

where we denote m_(min)=min{m_(t),m_(r)}, α=|m_(r)−m_(t)|, β=m−m_(t)−m_(r),

$b_{k,\alpha,\beta} = {{\frac{1}{{2k} + \alpha + \beta + 1}2k} + \alpha + {\beta_{k}2k} + \alpha + {\beta_{k + \alpha}^{- 1}.}}$

and P_(k) ^((α,β))(x) are the Jacobi polynomials

${P_{k}^{({\alpha,\beta})}(x)} = {\frac{\left( {- 1} \right)^{k}}{2^{k}{k!}}\left( {1 - x} \right)^{- \alpha}\left( {1 + x} \right)^{- \beta}{{\frac{^{k}}{x^{k}}\left\lbrack {\left( {1 - x} \right)^{k + \alpha}\left( {1 + x} \right)^{k + \beta}} \right\rbrack}.}}$

Proof. See Appendix 7.

3.2 The Case of m_(t)+m_(r)>m

Here we use Lemma 1 to compute the ergodic capacity:

Theorem 3

The ergodic channel capacity, in case m_(t)+m_(r)>m, is given by C(m_(t),m_(r),m;SNR)=(m_(t)+m_(r)−m)·C(1,1,1;SNR)+C(m−m_(r),m−m_(t),m;SNR), where C(1,1,1;SNR) is the SISO channel capacity C(1,1,1;SNR)=log(1+SNR) and C(m−m_(r),m−m_(t),m;SNR) is given by Theorem 2; we define C(0,m−m_(t),m;SNR)=C(m−m_(r),0,m;SNR)=0.

Proof. Let

$H = \begin{bmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{bmatrix}$

be the m×m unitary coupling matrix of the channel, where H₁₁ is the m_(r)×m_(t) coupling matrix between the addressed modes.

By Theorem 1 the ergodic capacity is

$\begin{matrix} {{C\left( {m_{t},m_{r},{m;{SNR}}} \right)} = {E\left\lbrack {\log \; {\det \left( {I_{m_{t}} + {{{SNR} \cdot H_{11}^{\dagger}}H_{11}}} \right)}} \right\rbrack}} & (15) \\ {= {{E\left\lbrack {\sum\limits_{i = 1}^{\min {\{{m_{t},m_{r}}\}}}\; {\log \left( {1 + {{SNR} \cdot \lambda_{i}}} \right)}} \right\rbrack}.}} & (16) \end{matrix}$

where λ₁, . . . , λ_(min{m_(t),m_(r)}) are the non-zero eigenvalues of H₁₁ ^(†)H₁₁. According to Lemma 1, m_(t)+m_(r)−m eigenvalues are 1 while the others are equal to the non-zero eigenvalues of H₂₂ ^(†)H₂₂.

Thus, we can write:

C(m _(t) ,m _(r) ,m;SNR)=(m _(t) +m _(r) −m)log(1+SNR)+E[log det(I _(m−m) _(t) +SNR·H ₂₂ ^(†) H ₂₂)].  (17)

We finish by reminding that the capacity of the discussed channel is symmetric in the number of transmitted and received modes, i.e., C(m−m_t,m−m_r,m;SNR)=C(m−m_r,m−m_t,m;SNR).

Theorem 3 suggests that the ergodic capacity, for the case of m_(t)+m_(r)>m, is as if the channel were composed of m_(t)+m_(r)−m parallel SISO channels and a single MIMO channel with m−m_(r) transmit modes and m−m_(t) receive modes. Note that an (m−m_(t))×(m−m_(r)) system satisfies (m−m_(r))+(m−m_(t))≦m, therefore its capacity is given by Theorem 2. Thus, the ergodic capacity can be viewed as the sum of max(m_(t)+m_(r)−m,0) SISO channel capacities and a residual MIMO capacity.
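The decomposition in Theorem 3 holds realization by realization, which makes it easy to check numerically. The sketch below assumes NumPy, a QR-based Haar draw, rates measured in bits (base-2 logarithm), and hypothetical function names; it averages both sides over random channels and records the largest per-realization gap.

```python
import numpy as np

def haar_unitary(m, rng):
    """Haar-distributed unitary via QR with column-phase correction."""
    g = (rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))) / np.sqrt(2)
    q, r = np.linalg.qr(g)
    d = np.diag(r)
    return q * (d / np.abs(d))

def logdet_bits(A, snr):
    """log2 det(I + SNR * A^dagger A) for a rectangular block A."""
    n = A.shape[1]
    _, val = np.linalg.slogdet(np.eye(n) + snr * (A.conj().T @ A))
    return val / np.log(2)

def capacity_terms(m, mt, mr, snr, trials=200, seed=0):
    """Monte Carlo averages of both sides of Theorem 3 and the worst mismatch."""
    rng = np.random.default_rng(seed)
    lhs = rhs = worst = 0.0
    for _ in range(trials):
        H = haar_unitary(m, rng)
        full = logdet_bits(H[:mr, :mt], snr)                 # addressed block H11
        parts = (mt + mr - m) * np.log2(1 + snr) + logdet_bits(H[mr:, mt:], snr)
        worst = max(worst, abs(full - parts))
        lhs += full
        rhs += parts
    return lhs / trials, rhs / trials, worst
```

A call such as `capacity_terms(5, 4, 3, 100.0)` is expected to return two matching averages and a mismatch at the level of floating-point rounding.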

4 Non-Ergodic Channel

In this section we assume that the channel is flat fading, i.e. the channel matrix is drawn randomly but is held fixed for a relatively long period of time. The channel is assumed to be known at the receiver but not at the transmitter. In this case, the mutual information between the input and output of the channel is

I(x;y|H ₁₁ =H ₁₁).  (18)

Since H₁₁ is independent of the channel input x and the receiver knows the realization of H₁₁, the mutual information is conditioned on the channel instantiation. However, in contrast to the ergodic channel, the channel realization is held fixed, i.e. the mutual information is assumed to be fixed for the entire transmission period. Since the channel is unknown at the transmitter, there may be a non-zero probability that the transmission rate is not supported by the channel instantiation. This probability is termed outage probability and is given by

P _(out)(m _(t) ,m _(r) ,m;R)=Pr[I(x;y|H ₁₁)<R],  (19)

that is, the probability that the capacity of the channel realization is smaller than the transmission rate R (bits/symbol).

It is well known that a circularly symmetric Gaussian zero-mean input distribution maximizes the mutual information of an AWGN channel, so we get

I(x;y|H ₁₁ =H ₁₁)=log det(I _(m) _(r) +SNR·H ₁₁ QH ₁₁ ^(†)),  (20)

where Q is the covariance matrix of the transmitted signal and is chosen such that the outage probability is minimized. Hence, we can write

$\begin{matrix} {{P_{out}\left( {m_{t},m_{r},m;R} \right)} = {\inf\limits_{\substack{Q:\, Q \succeq 0 \\ Q_{ii} \leq 1\ \forall i = 1,\ldots,m_{t}}}{{\Pr\left\lbrack {{\log\det\left( {I_{m_{r}} + {{SNR} \cdot H_{11}QH_{11}^{\dagger}}} \right)} < R} \right\rbrack}.}}} & (21) \end{matrix}$

In the first parts of this section we examine the outage probability for m_(t)+m_(r)≦m and m_(t)+m_(r)>m, where for the second case we show that a strictly zero outage probability is achievable for any transmission rate below (m_(t)+m_(r)−m)log(1+SNR). In the last part we present a new communication scheme that exploits a (delayed) feedback system to transmit at a rate arbitrarily close to (m_(t)+m_(r)−m)log(1+SNR) with zero outage probability.

4.1 The Case of m_(t)+m_(r)≦m

To simplify calculations we take the covariance matrix to be the identity matrix. We note that at high SNR the outage probability satisfies

P _(out)(m _(t) ,m _(r) ,m;R)≐Pr[log det(I _(m) _(t) +SNR·H ₁₁ ^(†) H ₁₁)<R],  (22)

Thus, Q=I_(m_(t)) is a reasonable choice for the high SNR regime. Here and throughout the rest of the paper we use ≐ to denote exponential equality, i.e., f(SNR)≐SNR^(d) denotes

$\begin{matrix} {{\lim\limits_{{SNR}\rightarrow\infty}\frac{\log f({SNR})}{\log {SNR}}} = {d.}} & (23) \end{matrix}$

Now, letting the transmission rate be R=r log(1+SNR) bps/Hz, we can write:

$\begin{matrix} {{P_{out}\left( {m_{t},m_{r},m;{r\log\left( {1 + {SNR}} \right)}} \right)} \doteq {\Pr\left\lbrack {{\log\det\left( {I_{m_{t}} + {{SNR} \cdot H_{11}^{\dagger}H_{11}}} \right)} < {r\log\left( {1 + {SNR}} \right)}} \right\rbrack}} \\ {= {\Pr\left\lbrack {{\prod\limits_{i = 1}^{\min\{m_{t},m_{r}\}}\left( {1 + {{SNR} \cdot \lambda_{i}}} \right)} < \left( {1 + {SNR}} \right)^{r}} \right\rbrack}} & (24) \end{matrix}$

for any 0≦r≦min{m_(t),m_(r)}, where λ₁≦ . . . ≦λ_(min{m_(t),m_(r)}) are the ordered non-zero eigenvalues of H₁₁ ^(†)H₁₁.

We can now use the joint pdf. of the ordered eigenvalues of the relevant Jacobi ensemble to compute (24):

$\begin{matrix} {{P_{out}\left( {m_{t},m_{r},m;{r\log\left( {1 + {SNR}} \right)}} \right)} \doteq {K_{m_{t},m_{r},m}^{- 1}{\int_{B}{{\prod\limits_{i = 1}^{\min\{m_{t},m_{r}\}}{\lambda_{i}^{|{m_{r} - m_{t}}|}\left( {1 - \lambda_{i}} \right)^{m - m_{r} - m_{t}}}}{\prod\limits_{i < j}\left( {\lambda_{i} - \lambda_{j}} \right)^{2}}\, d\lambda}}},} & (25) \end{matrix}$

where K_(m_(t),m_(r),m) is a normalizing factor and

$B = \left\{ {{{\lambda \text{:}0} \leq \lambda_{1} \leq \ldots \leq \lambda_{\min {\{{m_{t},m_{r}}\}}} \leq 1},{{\prod\limits_{i = 1}^{\min {\{{m_{t},m_{r}}\}}}\; \left( {1 + {{SNR} \cdot \lambda_{i}}} \right)} < \left( {1 + {SNR}} \right)^{r}}} \right\}$

is the set that describes the outage event.

This gives an analytical expression to the outage probability.

4.2 The Case of m_(t)+m_(r)>m

Since here the distribution of the singular values of H₁₁ changes, we use Lemma 1 to compute the outage probability:

Theorem 4

Suppose the transmission rate is R=r log(1+SNR) bps/Hz, with 0≦r≦min{m_(t),m_(r)}, where the number of addressed modes satisfies m_(t)+m_(r)>m. For r<m_(t)+m_(r)−m, the outage probability is strictly zero.

For r≧m_(t)+m_(r)−m the outage probability satisfies

P _(out)(m _(t) ,m _(r) ,m;r log(1+SNR))=P _(out)(m−m _(r) ,m−m _(t) ,m;(r−(m _(t) +m _(r) −m))log(1+SNR))  (26)

where the right hand side is given by Equation (25).

Proof. Let

$H = \begin{bmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{bmatrix}$

be the m×m unitary coupling matrix of the channel, where H₁₁ is the m_(r)×m_(t) coupling matrix between the addressed modes. By (24) the outage probability is

$\begin{matrix} {{{P_{out}\left( {m_{t},m_{r},{m;{r\; {\log \left( {1 + {SNR}} \right)}}}} \right)} = {\Pr\left\lbrack {{\prod\limits_{i = 1}^{\min {\{{m_{t},m_{r}}\}}}\; \left( {1 + {{SNR} \cdot \lambda_{i}}} \right)} < \left( {1 + {SNR}} \right)^{r}} \right\rbrack}},} & (27) \end{matrix}$

where λ₁≦ . . . ≦λ_(min{m_(t),m_(r)}) are the ordered non-zero eigenvalues of H₁₁ ^(†)H₁₁. Applying Lemma 1 we get:

$\begin{matrix} {{{P_{out}\left( {m_{t},m_{r},{m;{r\mspace{11mu} {\log \left( {1 + {SNR}} \right)}}}} \right)} = {\Pr\left\lbrack {{\prod\limits_{i = 1}^{m - {\max {\{{m_{t},m_{r}}\}}}}\; \left( {1 + {{SNR} \cdot {\overset{\sim}{\lambda}}_{i}}} \right)} < \left( {1 + {SNR}} \right)^{r - {({m_{t} + m_{r} - m})}}} \right\rbrack}},} & (28) \end{matrix}$

where {tilde over (λ)}₁≦ . . . ≦{tilde over (λ)}_(m−max{m_(t),m_(r)}) are the ordered non-zero eigenvalues of H₂₂ ^(†)H₂₂. If r<m_(t)+m_(r)−m, this probability is zero. Otherwise, the right-hand side is the outage probability in a system with m−m_(t) and m−m_(r) addressed modes at the transmitter and receiver, respectively, with a transfer matrix H₂₂.

We finish by reminding that the outage probability is symmetric in the number of transmitted and received modes, i.e., P_(out)(m−m_t,m−m_r,m;R)=P_(out)(m−m_r,m−m_t,m;R).

Note that (m−m_(r))+(m−m_(t))≦m for m_(t)+m_(r)>m, thus the right hand side of (26) is given by (25) and the outage probability is well defined. Thus Theorem 4 states that one can transmit r degrees of freedom, where r≧m_(t)+m_(r)−m, and achieve the outage probability as if only r−(m_(t)+m_(r)−m) degrees of freedom were sent through an (m−m_(t))×(m−m_(r)) MIMO channel. Furthermore, m_(t)+m_(r)−m degrees of freedom can be conveyed to the receiver with strictly zero outage probability. Intuitively, since m_(t)+m_(r)−m singular values of H₁₁ are 1 w.p. 1, there is a non-fading (m_(t)+m_(r)−m)-dimensional subspace for any realization of H₁₁, and signals can be transmitted over this subspace with zero outage probability.
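The zero-outage region of Theorem 4 can be illustrated with a Monte Carlo estimate. This is a sketch only, assuming NumPy, a QR-based Haar draw, the identity covariance Q=I used in this section, and rates in bits; `outage_probability` is a hypothetical helper and its output is an empirical frequency, not the analytical expression (25).

```python
import numpy as np

def haar_unitary(m, rng):
    """Haar-distributed unitary via QR with column-phase correction."""
    g = (rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))) / np.sqrt(2)
    q, r = np.linalg.qr(g)
    d = np.diag(r)
    return q * (d / np.abs(d))

def outage_probability(m, mt, mr, r, snr, trials=2000, seed=0):
    """Empirical Pr[ log2 det(I + SNR H11^dagger H11) < r log2(1 + SNR) ]."""
    rng = np.random.default_rng(seed)
    rate = r * np.log2(1 + snr)
    fails = 0
    for _ in range(trials):
        H11 = haar_unitary(m, rng)[:mr, :mt]
        sv = np.linalg.svd(H11, compute_uv=False)   # singular values of H11
        mutual_info = np.sum(np.log2(1 + snr * sv ** 2))
        fails += int(mutual_info < rate)
    return fails / trials
```

At 20 dB (SNR=100) with m=4 and m_(t)=m_(r)=3, any r below m_(t)+m_(r)−m=2 should give an empirical outage of exactly zero, while with m_(t)=m_(r)=2 (where m_(t)+m_(r)=m) the same rate is expected to show outage events.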

FIG. 8A depicts the outage probability in a 20 dB SNR for 4×4 unitary channel, for different number of addressed modes at the receiver and transmitter. In FIG. 8B we keep the number of addressed transmitter and receiver modes fixed, presenting the outage curves for different number of supported modes m.

4.3 Achieving Zero Outage Probability

We now present a new communication scheme for m_(r)×m_(t) MIMO channel, where m_(t)+m_(r)>m. According to Lemma 1, m_(t)+m_(r)−m singular values of H₁₁ are 1. The scheme, using simple manipulations, exploits a (delayed) feedback system to complete also the other m−max{m_(t),m_(r)} singular values to 1. Thus the channel is transformed into m_(t)+m_(r)−m independent SISO channels, supporting m_(t)+m_(r)−m streams (degrees of freedom) with zero outage probability. Furthermore, our scheme removes the need for MIMO processing and allows the use of simple SISO channel decoders.

We first describe the principles of the scheme using the following simple example:

Let the transmitter and receiver address 3 out of 4 available modes, i.e., the transform matrix is 3×3 sub-matrix of 4×4 unitary matrix (which is drawn uniformly from the manifold of all 4×4 unitary matrices). According to Theorem 4, two degrees of freedom can be communicated to the receiver with zero outage probability. Now, suppose only for the simplicity of the example that the channel instantiation changes independently at each channel use and let

$H^{(i)} = \begin{bmatrix} H_{11}^{(i)} & H_{12}^{(i)} \\ H_{21}^{(i)} & H_{22}^{(i)} \end{bmatrix}$

be the unitary matrix realization at channel use i. In addition, suppose that at each channel use i the transmitter has perfect knowledge of H₂₁ ^((i−1)), the (m−m_(r))×m_(t) sub-matrix realization of H^((i−1)).

Let the transmitter excite the following three-entry vector at each channel use i=1, . . . , n:

$x^{(i)} = \begin{bmatrix} x_{1}^{(i)} \\ x_{2}^{(i)} \\ {H_{21}^{({i - 1})}x^{({i - 1})}} \end{bmatrix},$

where we define x⁽⁰⁾ to be a vector of zeros. Hence, in each channel use the transmitter communicates two new information bearing symbols and a linear combination of the previous signal.

The received signal at each channel use i=1, . . . , n is y^((i))=√SNR·H₁₁ ^((i))x^((i))+z^((i)).

Now, since H₁₁ ^((i)) is assumed to be known at the receiver, H₂₁ ^((i)) can also be computed using the orthonormality of the columns of H^((i)). We further assume that the receiver has, as side information, the following noisy measure of x^((n)): y_(si) ^((n))=√SNR·H₂₁ ^((n))x^((n))+z_(si) ^((n)),

where z_(si) ^((n))∼CN(0,1) is independent of z^((n)). Thus, the receiver can construct the following vector

$y^{(n)} = {\begin{bmatrix} H_{11}^{{(n)}\dagger} & H_{21}^{{(n)}\dagger} \end{bmatrix} \cdot \begin{bmatrix} y^{(n)} \\ y_{si}^{(n)} \end{bmatrix}}$

Which satisfies

${y^{(n)} = {\begin{bmatrix} y_{1}^{(n)} \\ y_{2}^{(n)} \\ y_{3}^{(n)} \end{bmatrix} = {{{\sqrt{SNR}x^{(n)}} + z^{(n)}} = {{\sqrt{SNR}\begin{bmatrix} x_{1}^{(n)} \\ x_{2}^{(n)} \\ {H_{21}^{({n - 1})}x^{({n - 1})}} \end{bmatrix}} + z^{(n)}}}}},$

where z^((n)) has i.i.d. CN(0,1) entries. Letting y₃ ^((n)) serve as y_(si) ^((n−1)), the side information for channel use n−1, and repeating this procedure for i=n−1 down to 1 results in two streams of measures

$\begin{bmatrix} y_{1}^{(1)} \\ y_{2}^{(1)} \end{bmatrix},\begin{bmatrix} y_{1}^{(2)} \\ y_{2}^{(2)} \end{bmatrix},\ldots \mspace{11mu},{\begin{bmatrix} y_{1}^{(n)} \\ y_{2}^{(n)} \end{bmatrix}.}$

Thus, we get two information streams that behave as if they were communicated through two independent AWGN SISO channels, each with signal-to-noise ratio SNR (and therefore with zero outage probability).

Note that the scheme is feasible if the side information after channel use n is conveyed by the transmitter through a negligible number of channel uses (with respect to n) and if a feedback system is employed to communicate H₂₁ ^((i)) to the transmitter after each channel use i.
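The backward recombination in the example above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of the scheme: noise is omitted and SNR is set to 1 so that recovery can be checked exactly, the feedback delay is one channel use, and the helper names (`random_unitary`, `matvec`, `herm`, `run_scheme`) are hypothetical. The unitary matrix is drawn by Gram-Schmidt orthonormalization of complex Gaussian columns.

```python
import random

def random_unitary(m, rng):
    """Random m x m unitary: Gram-Schmidt on i.i.d. complex Gaussian columns."""
    cols = []
    for _ in range(m):
        v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(m)]
        for u in cols:
            proj = sum(ui.conjugate() * vi for ui, vi in zip(u, v))
            v = [vi - proj * ui for ui, vi in zip(u, v)]
        norm = sum(abs(vi) ** 2 for vi in v) ** 0.5
        cols.append([vi / norm for vi in v])
    return [[cols[c][r] for c in range(m)] for r in range(m)]  # H[row][col]

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def herm(A):
    """Conjugate transpose of a non-empty matrix."""
    return [[A[r][c].conjugate() for r in range(len(A))] for c in range(len(A[0]))]

def run_scheme(n=6, m=4, mt=3, mr=3, seed=1):
    """Noiseless run (assumption: SNR = 1, delay k = 1, m > mr)."""
    rng = random.Random(seed)
    d = mt + mr - m                      # new symbols per channel use
    qpsk = [complex(a, b) for a in (1, -1) for b in (1, -1)]
    sent, rx, chans = [], [], []
    tail = [0j] * (m - mr)               # x^(0) is all zeros
    for _ in range(n):
        H = random_unitary(m, rng)
        H11 = [row[:mt] for row in H[:mr]]
        H21 = [row[:mt] for row in H[mr:]]
        new = [rng.choice(qpsk) for _ in range(d)]
        x = new + tail                   # new symbols, then H21^(i-1) x^(i-1)
        sent.append(new)
        rx.append(matvec(H11, x))        # what the addressed receive modes see
        chans.append((H11, H21))
        tail = matvec(H21, x)            # available at the transmitter via feedback
    y_si = tail                          # side information for the last channel use
    decoded = [None] * n
    for i in reversed(range(n)):
        H11, H21 = chans[i]
        a = matvec(herm(H11), rx[i])
        b = matvec(herm(H21), y_si)
        x_hat = [p + q for p, q in zip(a, b)]   # H11^† y + H21^† y_si = x
        decoded[i] = x_hat[:d]
        y_si = x_hat[d:]                 # side information for channel use i-1
    return sent, decoded

sent, decoded = run_scheme()
max_err = max(abs(s - t) for S, T in zip(sent, decoded) for s, t in zip(S, T))
```

With the defaults (3 of 4 modes addressed on each side), every channel use yields two recovered symbols, and `max_err` stays at the level of floating-point rounding because the first m_(t) columns of the unitary matrix are orthonormal.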

We now formalize the scheme for any m_(r)×m_(t) Jacobi channel that satisfies m_(t)+m_(r)>m.

Let

$H^{(i)} = \begin{bmatrix} H_{11}^{(i)} & H_{12}^{(i)} \\ H_{21}^{(i)} & H_{22}^{(i)} \end{bmatrix}$

be the unitary matrix realization at channel use i. We assume perfect knowledge of H₁₁ ^((i)) at the receiver and a noiseless feedback communication with a delay of k channel uses. Since H^((i)) is unitary, H₂₁ ^((i)) can be computed from H₁₁ ^((i)), and we assume that the receiver noiselessly communicates H₂₁ ^((i)) to the transmitter. Note that H₂₁ ^((i)) completes the columns of H₁₁ ^((i)) into orthonormal columns; thus, for m_(t)+m_(r)−m>1 and certain matrix instantiations, the computed H₂₁ ^((i)) is not unique and can be chosen wisely (see Remark 4).

Now, the transmitter excites the following signal from the addressed modes at each channel use i=1, . . . , nk:

${x^{(i)} = \begin{bmatrix} x_{1}^{(i)} \\ \vdots \\ x_{m_{t} + m_{r} - m}^{(i)} \\ {H_{21}^{({i - k})}x^{({i - k})}} \end{bmatrix}},$

where x^((i)), for i=−(k−1), . . . , 0, is a vector of zeros. That is, the transmitter conveys m_(t)+m_(r)−m new information bearing symbols and H₂₁ ^((i−k))x^((i−k)), a linear combination of the signal that was transmitted k channel uses before. Note that since H is unitary, the power constraint remains satisfied.

Now, assume the transmitter communicates to the receiver the following measures: y_(si) ^((i))=√SNR·H₂₁ ^((i))x^((i))+z_(si) ^((i)), ∀i=(n−1)k+1, . . . , nk,

that is, noisy measures of the last k transmitted signals, where z_(si) ^((i)) are independent with i.i.d. CN(0,1) entries. As was shown above, the receiver can use the side information to get

$\begin{bmatrix} y_{1}^{(i)} \\ \vdots \\ y_{m_{t}}^{(i)} \end{bmatrix} = {{{\sqrt{SNR}x^{(i)}} + z^{(i)}} = {{\sqrt{SNR}\begin{bmatrix} x_{1}^{(i)} \\ \vdots \\ x_{m_{t} + m_{r} - m}^{(i)} \\ {H_{21}^{({i - k})}x^{({i - k})}} \end{bmatrix}} + z^{(i)}}}$

for all i=(n−1)k+1, . . . , nk, where z^((i)) are independent with i.i.d. CN(0,1) entries.

Letting

$\begin{bmatrix} y_{m_{t} + m_{r} - m + 1}^{(i)} \\ \vdots \\ y_{m_{t}}^{(i)} \end{bmatrix} = {{\sqrt{SNR}H_{21}^{({i - k})}x^{({i - k})}} + \begin{bmatrix} z_{m_{t} + m_{r} - m + 1}^{(i)} \\ \vdots \\ z_{m_{t}}^{(i)} \end{bmatrix}}$

be the side information y_(si) ^((i−k)) for channel use i−k, for all i=(n−1)k+1, . . . , nk, and repeating this procedure for i=(n−1)k down to i=1 results in m_(t)+m_(r)−m independent streams of measures

$\begin{bmatrix} y_{1}^{(1)} \\ \vdots \\ y_{m_{t} + m_{r} - m}^{(1)} \end{bmatrix},\ldots \mspace{11mu},{\begin{bmatrix} y_{1}^{({nk})} \\ \vdots \\ y_{m_{t} + m_{r} - m}^{({nk})} \end{bmatrix}.}$

Thus, having noisy measures of the last k symbols, y_(si) ^(((n−1)k+1)), . . . , y_(si) ^((nk)), the receiver can construct m_(t)+m_(r)−m SISO channels, each with a signal-to-noise ratio SNR. Assuming the transmitter can convey these measures using a negligible number of channel uses (with respect to n, see Remark 3), the scheme allows approaching the rate (m_(t)+m_(r)−m)log(1+SNR) with zero outage probability.

Remark 1 (Delayed Feedback)

The scheme exploits a noiseless feedback system to communicate (possibly) outdated information: the channel realization in previous channel uses. Thus, the feedback is not required to be fast, that is, there is no limitation on the delay time k. However, for non-ergodic systems with a short delay time, the feedback may carry information about the current channel realization, and the transmitter can then exploit the up-to-date feedback to use more efficient schemes (e.g., water filling). Nevertheless, for systems with a long delay time (e.g., relatively long distance optical fibers), the channel can be regarded as non-ergodic but with outdated feedback. In these cases our scheme efficiently achieves zero outage probability.

Remark 2 (Simple Decoding)

The scheme constructs m_(t)+m_(r)−m independent streams of measures, each with signal-to-noise ratio SNR. This keeps the decoding stage simple: a SISO channel decoder can be used, removing the need for further MIMO signal processing.

Remark 3 (Side Information Measures)

Given noisy measures of the last k transmitted signals, y_(si) ^((i))=√SNR·H₂₁ ^((i))x^((i))+z_(si) ^((i)), ∀i=nk−(k−1), . . . , nk,

where z_(si) ^((i)) are independent with i.i.d. CN(0,1) entries, the scheme can construct m_(t)+m_(r)−m independent streams of measures. Thus, the transmitter has to convey H₂₁ ^((i))x^((i)), for each i=nk−(k−1), . . . , nk, such that the receiver can extract a vector of noisy measures with a signal-to-noise ratio that is not smaller than SNR. This is feasible with a finite number of channel uses. For example, the repetition scheme can be used to convey these measures, each with a signal-to-noise ratio that is at least SNR (according to Lemma 1, see Section 5 Example 2). The repetition scheme conveys the m−m_(r) entries of H₂₁ ^((i))x^((i)) with N_(si)=m_(t)(m−m_(r)) channel uses: in each channel use a single entry is transmitted through a single mode (while all other modes are zero), in such a way that every entry is transmitted through every mode. More generally, suppose each H₂₁ ^((i))x^((i)) is conveyed to the receiver within N_(si) channel uses. By taking n large enough (with respect to N_(si)) one can approach the rate (m_(t)+m_(r)−m)log(1+SNR).
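As a rough accounting of this overhead (a back-of-the-envelope sketch that is not taken from the derivation above, and that assumes the nk data-bearing channel uses are followed by k side-information transmissions of N_(si) channel uses each), the effective rate R_(eff) would be

$R_{eff} \approx \frac{nk}{{nk} + {k N_{si}}}\left( {m_{t} + m_{r} - m} \right)\log\left( {1 + {SNR}} \right) = \frac{1}{1 + {N_{si}/n}}\left( {m_{t} + m_{r} - m} \right)\log\left( {1 + {SNR}} \right),$

which tends to (m_(t)+m_(r)−m)log(1+SNR) as n grows. For instance, with m_(t)=m_(r)=3 and m=4 the repetition scheme gives N_(si)=3, so under this accounting n=100 already retains a fraction 100/103 of the target rate.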

Remark 4 (The Uniqueness of H₂₁)

The scheme can be further improved to support an even higher data rate with zero outage probability. For example, the last m−m_(r) entries of the transmitted signal at the first k channel uses can be used to excite information bearing symbols instead of zero symbols. Furthermore, as was mentioned above, when m_(t)+m_(r)−m>1, H₂₁ ^((i)) is not unique; there are many (m−m_(r))×m_(t) matrices that complete the columns of H₁₁ ^((i)) into orthonormal columns. Thus, the receiver can choose H₂₁ ^((i)) to be the one with the largest number of zero rows. Now, at time i+k the transmitter excites m_(t)+m_(r)−m new symbols and H₂₁ ^((i))x^((i)), a retransmission of x^((i)), the transmitted signal at time i. With an appropriate choice of H₂₁ ^((i)), H₂₁ ^((i))x^((i)) contains entries that are zero. These entries can instead carry additional new information bearing symbols.

It seems that a further enhancement of the data rate can be achieved by exploiting the feedback to approach the empirical capacity for any realization of H₁₁. Note that this rate is achievable with an up-to-date feedback. The data rate can likewise be enhanced with an outdated feedback system (while retaining zero outage probability).

There can be provided a system that utilizes this scheme, a method that utilizes this scheme and a computer readable medium that is not transitory and stores instructions that once executed utilize the scheme.

5 Diversity Multiplexing Tradeoff

In this section we analyze the tradeoff between diversity and multiplexing in the Jacobi channel. We first examine the error probability of two simple examples: an uncoded transmission in a 1×m_(r) system and a repetition scheme in an m_(t)×m_(r) system. Note that the latter can be viewed as a generalization of the former.

Example 1 $m_t=1$

Consider a transmission of an uncoded signal through a single mode. When all available modes are addressed at the receiver, that is m_(r)=m, the entire signal power is extracted and the channel corresponds to a SISO unfading channel with an exponentially decaying (with SNR) error probability. However, for m_(r)<m some power is lost in the unaddressed modes, resulting in a higher error probability. Suppose the transmitter excites an uncoded QPSK signal (similar results can be obtained for higher constellations). Using the sphere bound we can upper bound the error probability for a given channel realization:

Pr(error|H₁₁=H₁₁)≦Pr(|z|²>(SNR/2)∥H₁₁∥_(F) ²)  (29)

where z∼CN(0,1) and ∥H₁₁∥_(F) is the Frobenius norm of a matrix: ∥A∥_(F) ²=Σ_(ij)|A_(ij)|²=Σ_(i)λ_(i), where λ_(i) are the squared singular values of A. Since for high SNR this bound is tight, we can write

$\begin{matrix} {{{\Pr \left( {{{error}H_{11}} = H_{11}} \right)} \doteq {\exp\left( {{- \frac{SNR}{2}}{H_{11}}_{F}^{2}} \right)}},} & (30) \end{matrix}$

where we further applied the cdf of a chi-squared random variable with 2 degrees of freedom. Thus, by letting λ be the squared singular value of H₁₁, we can write:

$\begin{matrix} {{\Pr \left( {{{error}{H_{11}}_{F}^{2}} = \lambda} \right)} \doteq {{\exp\left( {{- \frac{SNR}{2}}\lambda} \right)}.}} & (31) \end{matrix}$

By taking the expectation over (31) w.r.t λ we get the error probability

$\begin{matrix} {{P_{e}({SNR})} \doteq {{E\left\lbrack ^{{- \frac{SNR}{2}}\lambda} \right\rbrack}.}} & (32) \end{matrix}$

Now, for m_(r)=m we always have λ=1, thus the error probability satisfies

$\begin{matrix} {{P_{e}({SNR})} \doteq {^{- \frac{SNR}{2}}.}} & (33) \end{matrix}$

For m_(r)<m, we can use (4), the pdf. of the eigenvalue of a Jacobi matrix J(m_(r),m−m_(r),1), to calculate the right hand side of (32):

$\begin{matrix} {{E\left\lbrack ^{{- \frac{SNR}{2}}\lambda} \right\rbrack} = {\int_{0}^{1}{\Pr\left( {\lambda \; {re}^{{- \frac{SNR}{2}}\lambda}{\lambda}} \right.}}} & {\mspace{85mu} (34)} \\ {= {K_{1,m_{r},m}^{- 1}{\int_{0}^{1}{{\lambda^{m_{r} - 1}\left( {1 - \lambda} \right)}^{m - m_{r} - 1}^{{- \frac{SNR}{2}}\lambda}\ {{\lambda}.}}}}} & { (35)} \end{matrix}$

We use the (finite) Taylor expansion of (1−x)^(a), for integer a, to obtain

$\begin{matrix} {{E\left\lbrack ^{{- \frac{SNR}{2}}\lambda} \right\rbrack} = {K_{1,m_{r},m}^{- 1}{\sum\limits_{i = 0}^{m - m_{r} - 1}\; {\left( {- 1} \right)^{i}\begin{pmatrix} {m - m_{r} - 1} \\ i \end{pmatrix}{\int_{0}^{1}{\lambda^{m_{r} + i - 1}^{{- \frac{SNR}{2}}\lambda}\ {\lambda}}}}}}} & {(36)} \\ {= {K_{1,m_{r},m}^{- 1}{\sum\limits_{i = 0}^{m - m_{r} - 1}{\left( {- 1} \right)^{i}\begin{pmatrix} {m - m_{r} - 1} \\ i \end{pmatrix}}}}} & {(37)} \\ {\left\lbrack {{{\left( {m_{r} + i - 1} \right)!}\left( \frac{SNR}{2} \right)^{- {({m_{r} + i})}}} -} \right.} & \\ {\left. {^{- \frac{SNR}{2}}{\sum\limits_{j = 0}^{m_{r} + i - 1}{\begin{pmatrix} {m_{r} + i - 1} \\ j \end{pmatrix}{j!}\left( \frac{SNR}{2} \right)^{- {({j + 1})}}}}} \right\rbrack.} &  \end{matrix}$

At high SNR, (37) is dominated by the term K_(1,m_(r),m)⁻¹(m_(r)−1)!(SNR/2)^(−m_(r)), and since the bound is tight in this regime we can write

$\begin{matrix} {{P_{e}({SNR})} \doteq \left\{ \begin{matrix} {{SNR}^{- m_{r}},} & {m_{r} \neq m} \\ {^{- \frac{SNR}{2}},} & {m_{r} = {m.}} \end{matrix} \right.} & (38) \end{matrix}$
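The closed form in (37) and its high-SNR behavior can be checked numerically. The sketch below is illustrative only: it assumes that the normalizing constant K_(1,m_(r),m) of the single-eigenvalue density in (35) is the Beta-function value (m_(r)−1)!(m−m_(r)−1)!/(m−1)!, and the function names (`pe_closed_form`, `pe_numeric`, `dominant_term`) are hypothetical.

```python
import math

def _norm_const(mr, m):
    # Assumed normalizer of lambda^(mr-1) (1-lambda)^(m-mr-1) on [0, 1].
    return math.factorial(mr - 1) * math.factorial(m - mr - 1) / math.factorial(m - 1)

def pe_closed_form(mr, m, snr):
    """Right hand side of (37), for mr < m."""
    a = snr / 2.0
    total = 0.0
    for i in range(m - mr):
        n = mr + i - 1
        tail = sum(math.comb(n, j) * math.factorial(j) * a ** (-(j + 1))
                   for j in range(n + 1))
        term = math.factorial(n) * a ** (-(n + 1)) - math.exp(-a) * tail
        total += (-1) ** i * math.comb(m - mr - 1, i) * term
    return total / _norm_const(mr, m)

def pe_numeric(mr, m, snr, steps=4000):
    """Composite Simpson evaluation of the integral in (35); steps must be even."""
    a = snr / 2.0
    f = lambda lam: lam ** (mr - 1) * (1 - lam) ** (m - mr - 1) * math.exp(-a * lam)
    h = 1.0 / steps
    s = f(0.0) + f(1.0)
    for j in range(1, steps):
        s += (4 if j % 2 else 2) * f(j * h)
    return s * h / 3.0 / _norm_const(mr, m)

def dominant_term(mr, m, snr):
    """Leading high-SNR term K^-1 (mr-1)! (SNR/2)^(-mr)."""
    return math.factorial(mr - 1) * (snr / 2.0) ** (-mr) / _norm_const(mr, m)

ratio_high_snr = pe_closed_form(2, 4, 2000.0) / dominant_term(2, 4, 2000.0)
```

For m_(r)=2 and m=4 the two evaluations agree closely at moderate SNR, and `ratio_high_snr` approaches 1 as SNR grows, which is the SNR^(−m_(r)) decay stated in (38).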

Thus, the number of received modes dictates the decaying order of the error probability at high SNR. Bearing in mind that the m_(r)×1 channel matrix can be viewed as a sub-vector of an m×1 vector that was constructed by normalizing m i.i.d. complex Gaussian random variables, it is not surprising that the error probability in the analogous Rayleigh channel has a similar behavior at high SNR. But is this also true for m_(t)≠1? To answer this, we examine the error probability of the repetition scheme in an m_(r)×m_(t) system.

Example 2 Repetition Scheme

Suppose the transmitter excites the following (m_(t) entries) signals in each m_(t) consecutive channel uses:

$\begin{bmatrix} x \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix},\begin{bmatrix} 0 \\ x \\ 0 \\ \vdots \\ 0 \end{bmatrix},\ldots \mspace{14mu},\begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ x \end{bmatrix}$

where x is an uncoded QPSK symbol (similar results can also be obtained for higher constellations). Let us assume that the channel is constant within the m_(t) channel uses and, for simplicity of notation, we further assume m_(r)≧m_(t) (similar results can be obtained for m_(t)>m_(r)). Thus, with the same considerations as before, the error probability satisfies

$\begin{matrix} {{{P_{e}({SNR})} \doteq \left\lbrack {\exp\left( {{- \frac{SNR}{2}}{\sum\limits_{i = 1}^{m_{t}}\lambda_{i}}} \right)} \right\rbrack},} & (39) \end{matrix}$

Where the expectation is over λ₁≦ . . . ≦λ_(m) _(t) , the ordered non-zero eigenvalues of H₁₁ ^(†)H₁₁.

Now, using the joint pdf. of the ordered eigenvalues of a Jacobi matrix J(m_(r),m−m_(r),m_(t)) we can analyze (39) for m_(t)+m_(r)≦m:

$\begin{matrix} {E\left\lbrack \exp\left( - \frac{SNR}{2}\sum\limits_{i = 1}^{m_{t}}\lambda_{i} \right) \right\rbrack = \int\ldots\int\Pr\left( \lambda_{1},\ldots,\lambda_{m_{t}} \right) e^{- \frac{SNR}{2}\sum\limits_{i = 1}^{m_{t}}\lambda_{i}}\, d\lambda_{1}\ldots d\lambda_{m_{t}}} & (40) \\ {= K_{m_{t},m_{r},m}^{- 1}\int_{0}^{1}\int_{\lambda_{1}}^{1}\ldots\int_{\lambda_{m_{t} - 1}}^{1}\prod\limits_{i = 1}^{m_{t}}\lambda_{i}^{m_{r} - m_{t}}\left( 1 - \lambda_{i} \right)^{m - (m_{r} + m_{t})} e^{- \frac{SNR}{2}\lambda_{i}}\prod\limits_{i < j}\left( \lambda_{j} - \lambda_{i} \right)^{2}\, d\lambda_{1}\ldots d\lambda_{m_{t}}} & (41) \\ {= \frac{K_{m_{t},m_{r},m}^{- 1}}{m_{t}!}\int_{0}^{1}\ldots\int_{0}^{1}\prod\limits_{i = 1}^{m_{t}}\lambda_{i}^{m_{r} - m_{t}}\left( 1 - \lambda_{i} \right)^{m - (m_{r} + m_{t})} e^{- \frac{SNR}{2}\lambda_{i}}\prod\limits_{i < j}\left( \lambda_{j} - \lambda_{i} \right)^{2}\, d\lambda_{1}\ldots d\lambda_{m_{t}},} & (42) \end{matrix}$

Where in the last equation we used the joint pdf. of the un-ordered eigenvalues. Note that the term

$\prod\limits_{1 \leq i < j \leq m_{t}}\; \left( {\lambda_{j} - \lambda_{i}} \right)$

is the determinant of the Vandermonde matrix

$\quad{\begin{bmatrix} 1 & \ldots & 1 \\ \lambda_{1} & \ldots & \lambda_{m_{t}} \\ \vdots & \; & \vdots \\ \lambda_{1}^{m_{t} - 1} & \ldots & \lambda_{m_{t}}^{m_{t} - 1} \end{bmatrix}.}$

Thus we can write

$\begin{matrix} {{{\prod\limits_{1 \leq i < j \leq m_{t}}\; \left( {\lambda_{j} - \lambda_{i}} \right)^{2}} = {\sum\limits_{\sigma_{1},{\sigma_{2} \in S_{m_{t}}}}\; {\left( {- 1} \right)^{{{sgn}{(\sigma_{1})}} + {{sgn}{(\sigma_{2})}}}{\prod\limits_{i = 1}^{m_{t}}\; \lambda_{i}^{{\sigma_{1}{(i)}} + {\sigma_{2}{(i)}} - 2}}}}},} & (43) \end{matrix}$

where S_(m_(t)) is the set of permutations of 1, . . . , m_(t) and sgn(σ) denotes the signature of the permutation σ. Substituting (43) into (42) yields

$\begin{matrix} {E\left\lbrack \exp\left( - \frac{SNR}{2}\sum\limits_{i = 1}^{m_{t}}\lambda_{i} \right) \right\rbrack = \frac{K_{m_{t},m_{r},m}^{- 1}}{m_{t}!}\sum\limits_{\sigma_{1},\sigma_{2} \in S_{m_{t}}}\left( - 1 \right)^{{sgn}(\sigma_{1}) + {sgn}(\sigma_{2})}\prod\limits_{i = 1}^{m_{t}}\int_{0}^{1}\lambda_{i}^{m_{r} - m_{t} + \sigma_{1}(i) + \sigma_{2}(i) - 2}\left( 1 - \lambda_{i} \right)^{m - (m_{r} + m_{t})} e^{- \frac{SNR}{2}\lambda_{i}}\, d\lambda_{i}.} & (44) \end{matrix}$
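The permutation expansion (43) that underlies (44) can be sanity-checked on small integer inputs. The following is a minimal sketch with hypothetical helper names; it takes sgn(σ) as the parity of the number of inversions, which is how the sign enters through the exponent of −1.

```python
from itertools import permutations

def parity(perm):
    """Number of inversions mod 2 (0 for even permutations, 1 for odd)."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j]) % 2

def vandermonde_sq_direct(lams):
    """Left hand side of (43): product over i < j of (lam_j - lam_i)^2."""
    out = 1
    for j in range(len(lams)):
        for i in range(j):
            out *= (lams[j] - lams[i]) ** 2
    return out

def vandermonde_sq_expansion(lams):
    """Right hand side of (43): double sum over permutations of 1..n."""
    n = len(lams)
    perms = list(permutations(range(1, n + 1)))
    total = 0
    for s1 in perms:
        for s2 in perms:
            sign = (-1) ** (parity(s1) + parity(s2))
            prod = 1
            for i, lam in enumerate(lams):
                prod *= lam ** (s1[i] + s2[i] - 2)
            total += sign * prod
    return total

direct = vandermonde_sq_direct((2, 3, 5))
expanded = vandermonde_sq_expansion((2, 3, 5))
```

Because each factor in the expansion depends on a single λ_(i), the multiple integral in (42) splits into a product of one-dimensional integrals, which is what (44) expresses.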

With the same techniques used before (to get equation (37)) we get that the error probability at high SNR is dominated by the term

$\begin{matrix} {\frac{K_{m_{t},m_{r},m}^{- 1}}{m_{t}!}{\sum\limits_{\sigma_{1},{\sigma_{2} \in S_{m_{t}}}}\; {\left( {- 1} \right)^{{{sgn}{(\sigma_{1})}} + {{sgn}{(\sigma_{2})}}}{\prod\limits_{i = 1}^{m_{t}}{{\left( {m_{r} - m_{t} + {\sigma_{1}(i)} + {\sigma_{2}(i)} - 2} \right)!}{\left( \frac{SNR}{2} \right)^{- {({m_{r} - m_{t} + {\sigma_{1}{(i)}} + {\sigma_{2}{(i)}} - 1})}}.}}}}}} & (45) \end{matrix}$

Thus, for m_(t)+m_(r)≦m, the error probability satisfies

$\begin{matrix} {{P_{e}({SNR})} \doteq {C_{m_{t},m_{r},m}{SNR}^{- {\sum\limits_{i = 1}^{m_{t}}\; {({m_{r} - m_{t} + {2i} - 1})}}}}} & {(46)} \\ {{\doteq {SNR}^{{- m_{r}} \cdot m_{t}}},} & {{~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~}(47)} \end{matrix}$
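The step from (46) to (47) is only the evaluation of an arithmetic sum; written out,

$\sum\limits_{i = 1}^{m_{t}}\left( m_{r} - m_{t} + 2i - 1 \right) = m_{t}\left( m_{r} - m_{t} \right) + m_{t}\left( m_{t} + 1 \right) - m_{t} = m_{r} \cdot m_{t}.$

For instance, m_(t)=2 and m_(r)=3 give the terms 2 and 4, whose sum is 6=m_(r)·m_(t).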

where it can be shown that C_(m_(t),m_(r),m) is a non-zero constant.

For m_(t)+m_(r)>m, by applying Lemma 1 into (39) we get

${{P_{e}({SNR})} \doteq {^{- \frac{{SNR}{({m_{t} + m_{r} - m})}}{2}} \cdot {E\left\lbrack {\exp\left( {{- \frac{SNR}{2}}{\sum\limits_{i = 1}^{m - m_{r}}\; {\overset{\sim}{\lambda}}_{i}}} \right)} \right\rbrack}}},$

Where {tilde over (λ)}₁≦ . . . ≦{tilde over (λ)}_(m−m) _(r) are the non-zero ordered eigenvalues of H₂₂ ^(†)H₂₂. Since H₂₂ applies to the first case, that is, (m−m_(t))+(m−m_(r))<m, we can use (47) and conclude that the error probability of the repetition scheme satisfies

$\begin{matrix} {{P_{e}({SNR})} \doteq \left\{ \begin{matrix} {{SNR}^{{- m_{r}} \cdot m_{t}},} & {{m_{t} + m_{r}} \leq m} \\ {{^{- \frac{{SNR}{({m_{t} + m_{r} - m})}}{2}} \cdot {SNR}^{{- {({m - m_{t}})}} \cdot {({m - m_{r}})}}},} & {{m_{t} + m_{r}} > {m.}} \end{matrix} \right.} & (48) \end{matrix}$

Equation (48) implies that, for m_(t)+m_(r)≦m, the exponent of the dominant term in the error probability is m_(r)·m_(t) (which is similar to the Rayleigh channel, see further discussion in Section 6). Thus the performance gain of an m_(t)×m_(r) system compared to a system with a single transmit and receive mode is dictated by the SNR exponent of the error probability. This SNR exponent is termed the diversity gain. Intuitively, the total transmitted power is spread over all m available modes, thus addressing only some modes results in a power loss. As more modes are addressed at the receiver and as the transmitter excites more modes, the probability of a substantial power loss becomes smaller. Analogously, in wireless systems, as the signal passes through more (independent) paths, the probability of a fade becomes smaller. However, in the Jacobi channel it turns out that there is a transition threshold beyond which enough modes are addressed to ensure a certain received power. This results in an exponentially decaying error probability for appropriate rates.

Now, in Section 3 we have analyzed the ergodic capacity and showed that as more modes are being addressed, the capacity increases. Thus, increasing the number of addressed modes has another potential gain—higher data rate. This gain is termed spatial multiplexing gain. In MIMO systems there is a fundamental tradeoff between the diversity and multiplexing gains. The optimal tradeoff for the Rayleigh channel was presented. We now turn to analyze this tradeoff in the Jacobi channel. To that end, we formalize the concepts of diversity gain and multiplexing gain by quoting some definitions:

Definition 4

Let a scheme be a family of codes {C(SNR)} of block length l, one at each SNR level. Let R(SNR) (bits/symbols) be the rate of the code C(SNR). A scheme {C(SNR)} is said to achieve spatial multiplexing gain r and diversity gain d if the data rate satisfies

${\lim\limits_{{SNR}->\infty}\frac{R({SNR})}{\log \mspace{11mu} {SNR}}} = r$

End the average error probability satisfies

${\lim\limits_{{SNR}->\infty}\frac{P_{e}({SNR})}{\log \; {SNR}}} = {- {d.}}$

For each r, define d*(r) to be the supremum of the diversity advantage achieved over all schemes.

For example, let us discuss the uncoded repetition scheme. For m_(t)+m_(r)≦m, the diversity gain is m_(r)·m_(t) when transmitting a signal from a fixed constellation. E.g., for QPSK modulation the data rate is fixed, R(SNR)=1/m_(t) (bps/Hz) for any SNR. Thus, for a diversity gain of m_(r)·m_(t) the scheme achieves a multiplexing gain of 0. By increasing the constellation size with SNR to achieve a higher multiplexing gain, i.e., to support a data rate of R(SNR)=r log SNR (bps/Hz) (for some 0<r<1/m_(t)), the minimum distance between the constellation points decreases with SNR. This results in an error probability with a smaller decaying order, that is, a lower diversity gain. See further discussion in Example 3.

We next find the optimal tradeoff, d*(r), for the Jacobi channel.

5.1 The Case of m_(t)+m_(r)≦m

Theorem 5

Let the block length satisfy l≧m_(t)+m_(r)−1. The optimal diversity-multiplexing tradeoff curve d*(r) for m_(t)+m_(r)≦m, is given by the piecewise linear function that connects the points (k, d*(k)) for k=0, 1, . . . , min{m_(t),m_(r)}, where

d*(k)=(m _(t) −k)(m _(r) −k).  (49)

Proof. See Appendix 8.

Hence, for m_(t)+m_(r)≦m, the optimal tradeoff curve is equivalent to the optimal curve in the Rayleigh channel. We note that for l<m_(t)+m_(r)−1, bounds on d*(r) can be obtained using results from Appendix 8.

5.2 The Case of m_(t)+m_(r)>m

According to Theorem 4 we can write

P _(out)(m _(t) ,m _(r) ,m;r log(1+SNR))=P _(out)(m−m _(r) ,m−m _(t) ,m;(r−(m _(t) +m _(r) −m))log(1+SNR))  (50)

for r≧m_(t)+m_(r)−m.

Thus, for rates above (m_(t)+m_(r)−m)log(1+SNR), the optimal diversity-multiplexing tradeoff can be found from Theorem 5. Theorem 4 further states that the outage probability for rates below (m_(t)+m_(r)−m) log(1+SNR) is strictly zero. Hence, for multiplexing gains below m_(t)+m_(r)−m there is a scheme that can convey unfading signals to the receiver, thereby achieving an exponentially decaying error probability. In this case the discussion about diversity is no longer relevant. Nonetheless, one can think of the gain as infinite. This reveals an interesting difference between the Jacobi and Rayleigh channels—the maximum diversity gain is “unbounded” vs. m_(r)·m_(t).

The following Theorem states the above.

Theorem 6

The optimal diversity-multiplexing tradeoff curve d*(r), for m_(t)+m_(r)>m, is given by

$\begin{matrix} {{d^{*}(r)} = \left\{ \begin{matrix} {{d_{risdual}^{*}\left( {r - \left( {m_{t} + m_{r} - m} \right)} \right)},} & {r \geq {m_{t} + m_{r} - m}} \\ {\infty,} & {r < {m_{t} + m_{r} - m}} \end{matrix} \right.} & (51) \end{matrix}$

Where d*_(risdual)(r) is the optimal curve of an (m−m_(r))×(m−m_(t)) system. For a block length l≧m_(t)+m_(r)−1, the optimal curve d*_(risdual)(r) is the piecewise linear function that connects the points (k,d*_(risdual)(k)) for k=0, 1, . . . , min{m−m_(r),m−m_(t)} where

d* _(risdual)(k)=(m−m _(r) −k)(m−m _(t) −k).  (52)
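The two tradeoff curves of Theorems 5 and 6 can be evaluated with a short helper. This is an illustrative sketch only: the function names (`piecewise_dmt`, `jacobi_dmt`) are hypothetical, the block-length condition of the theorems is assumed to hold, and the infinite diversity gain of (51) is represented by `math.inf`.

```python
import math

def piecewise_dmt(r, a, b):
    """Piecewise-linear curve through the points (k, (a-k)(b-k)), k = 0..min(a, b)."""
    top = min(a, b)
    if r < 0 or r > top:
        raise ValueError("multiplexing gain outside [0, min(a, b)]")
    if top == 0:
        return 0.0
    k = min(int(math.floor(r)), top - 1)   # left end of the linear segment
    d0 = (a - k) * (b - k)
    d1 = (a - k - 1) * (b - k - 1)
    return d0 + (r - k) * (d1 - d0)

def jacobi_dmt(r, mt, mr, m):
    """d*(r) following (49) for mt + mr <= m and (51)-(52) otherwise."""
    if mt + mr <= m:
        return piecewise_dmt(r, mt, mr)
    free = mt + mr - m                      # unfaded degrees of freedom
    if r < free:
        return math.inf
    return piecewise_dmt(r - free, m - mr, m - mt)
```

For example, `jacobi_dmt(0.5, 2, 2, 4)` lies on the segment between (0,4) and (1,1), whereas `jacobi_dmt(0.5, 2, 2, 3)` returns infinity because one degree of freedom is conveyed without fading.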

Proof. Immediate from Theorem 4 since the error probability at high SNR is dominated by the outage probability, see Appendix 8.

In the following example we try to illuminate the concept of infinite diversity gain.

Example 3 $m_t=m_r=2$

We consider the 2×2 Alamouti scheme. Assuming a code block of length l≧3 and rate R=r log SNR (bps/Hz), the transmitter excites in each two consecutive channel uses two information bearing symbols in the following manner

$\begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix},{\begin{bmatrix} {- x_{2}^{\dagger}} \\ x_{1}^{\dagger} \end{bmatrix}.}$

ML decoding linearly combines the received measures and yields the following equivalent scalar channels:

y_(i)=√(∥H₁₁∥_(F) ²SNR)·x_(i)+z_(i), ∀i=1, 2  (53)

where each z_(i) is distributed CN(0,1), independent of x_(i) and H₁₁. The probability for an outage event is given by

P_(out)(2,2,m;R)=Pr(log(1+∥H₁₁∥_(F) ²SNR)<r log SNR)  (54)

≐Pr(∥H₁₁∥_(F) ²<SNR^(−(1−r)⁺)).  (55)

Now, for the Rayleigh fading channel ∥H₁₁∥_(F) ² is chi-square distributed with 2m_(t)m_(r) degrees of freedom. In this case, the 2×2 Alamouti scheme can achieve maximum diversity gain of 4.

However, in the Jacobi channel: for m=2 we have ∥H₁₁∥_(F) ²=2 (H₁₁=H is unitary) and for m=3 we have ∥H₁₁∥_(F) ²≧1 (by Lemma 1 and the fact that the squared Frobenius norm equals the sum of the eigenvalues of H₁₁ ^(†)H₁₁). For m≧4 there is always a non-zero probability for an outage event.

Therefore, for m=2 and m=3, for any r we get equivalent unfading scalar channels with strictly zero outage probability and one can think of the maximum diversity gain as infinite.

For m≧4 it can be shown that the maximum diversity gain is 4 and the optimal tradeoff curve linearly connects the points (1,0) and (0,4).
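The behavior of ∥H₁₁∥_(F) ² in Example 3 can be observed by sampling. The sketch below is a hedged illustration: it draws unitary matrices by Gram-Schmidt orthonormalization of complex Gaussian columns (an assumed way of sampling, with hypothetical helper names) and records the smallest squared Frobenius norm of the 2×2 sub-matrix seen for m=2, 3 and 4.

```python
import random

def random_unitary(m, rng):
    """Random m x m unitary: Gram-Schmidt on i.i.d. complex Gaussian columns."""
    cols = []
    for _ in range(m):
        v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(m)]
        for u in cols:
            proj = sum(ui.conjugate() * vi for ui, vi in zip(u, v))
            v = [vi - proj * ui for ui, vi in zip(u, v)]
        norm = sum(abs(vi) ** 2 for vi in v) ** 0.5
        cols.append([vi / norm for vi in v])
    return [[cols[c][r] for c in range(m)] for r in range(m)]  # H[row][col]

def frob_sq_submatrix(m, mt, mr, rng):
    """Squared Frobenius norm of the top-left mr x mt block H11."""
    H = random_unitary(m, rng)
    return sum(abs(H[r][c]) ** 2 for r in range(mr) for c in range(mt))

rng = random.Random(7)
min_by_m = {
    m: min(frob_sq_submatrix(m, 2, 2, rng) for _ in range(500))
    for m in (2, 3, 4)
}
```

In such a run the minimum for m=2 equals 2 up to rounding, the minimum for m=3 does not drop below 1, and for m=4 values below 1 do occur, consistent with the non-zero outage probability stated for m≧4.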

6 Discussion

Now, let G₁ and G₂ be m_(r)×m_(t) and (m−m_(r))×m_(t) independent random matrices, each with i.i.d. complex standard Gaussian entries. Edelman and Sutton showed that the squared generalized singular values of G₁ and G₂ follow the law of the Jacobi ensemble J(m_(r),m−m_(r),m_(t)). Furthermore, since the Jacobi ensemble J(m_(r),m−m_(r),m_(t)) can be constructed by G₁ ^(†)G₁(G₁ ^(†)G₁+G₂ ^(†)G₂)⁻¹, we can roughly say that, in terms of singular values, the channel H₁₁ can be viewed as an m_(r)×m_(t) sub-matrix of a normalized m×m_(t) Gaussian matrix, or as a sub-channel of a normalized Gaussian channel.

Thus, at high SNR, for m_(t)+m_(r)>m the error probability of the repetition scheme decays exponentially with SNR. For m_(t)+m_(r)≦m, the exponent of the dominant term in the error probability, m_(r)·m_(t), is as in the analogous Rayleigh channel. Note that this decaying order does not depend on the total number of available modes, m, i.e., it does not account for the degree of orthogonality of the channel columns, despite the fact that the columns of the channel become more independent as m grows. In fact, at high SNR, the decaying order of the error probability behaves as in wireless systems, where the columns are i.i.d. Gaussian vectors. Intuitively, at high SNR and for m_(t)+m_(r)≦m, the error probability is dominated by the outage event, where the impact of the orthonormality of the channel columns is negligible. This is not always true for m_(t)+m_(r)>m.

7 Proof of Theorem 2

By Theorem 1, the ergodic capacity satisfies

$\begin{matrix} {{C\left( {m_{t},m_{r},{m;{SNR}}} \right)} = {E\left\lbrack {\log \; {\det \left( {I_{m_{t}} + {{{SNR} \cdot H_{11}^{\dagger}}H_{11}}} \right)}} \right\rbrack}} & {(56)} \\ {= {E\left\lbrack {\sum\limits_{i = 1}^{m_{t}}{\log \left( {1 + {{SNR} \cdot \lambda_{i}}} \right)}} \right\rbrack}} & {(57)} \end{matrix}$

where we denote by λ={λ₁, . . . , λ_(m_(t))} the non-zero eigenvalues of H₁₁ ^(†)H₁₁. Note that, for simplicity of notation, we assume m_(r)≧m_(t) (all results also hold for m_(t)>m_(r) by switching m_(t) and m_(r)). Since the unordered eigenvalues are identically distributed, we can write the ergodic capacity as an expectation over only one of them:

C(m_(t),m_(r),m;SNR)=m_(t)·E[log(1+SNR·λ₁)].  (58)

Now, the joint pdf. of the ordered eigenvalues, f_(λ)(λ₁, . . . , λ_(m) _(t) ), is given by (4). The joint pdf. of the unordered eigenvalues equals

${\frac{1}{m_{t}!}{f_{\lambda}\left( {\lambda_{1},\ldots \mspace{14mu},\lambda_{m_{t}}} \right)}},$

Thus we can compute the density of λ₁ by integrating out λ₂, . . . , λ_(m) _(t) :

$\begin{matrix} {{f_{\lambda_{1}}\left( \lambda_{1} \right)} = {\int_{0}^{1}\mspace{14mu} {\ldots \mspace{14mu} {\int_{0}^{1}{\frac{1}{m_{t}!}{f_{\lambda}\left( {\lambda_{1},\ldots \mspace{14mu},\lambda_{m_{t}}} \right)}{\lambda_{2}}\mspace{14mu} \ldots \mspace{14mu} {{\lambda_{m_{t}}}.}}}}}} & (59) \end{matrix}$

By taking

$\begin{matrix} {\lambda_{i} = {\frac{1}{2}\left( {1 - {\overset{\sim}{\lambda}}_{i}} \right)}} & (60) \end{matrix}$

We can write

f _({tilde over (λ)}) ₁ ({tilde over (λ)}₁)=∫⁻¹ ¹ . . . ∫⁻¹ ¹ f _({tilde over (λ)})({tilde over (λ)}₁, . . . , {tilde over (λ)}_(m) _(t) )d{tilde over (λ)} ₂ . . . d{tilde over (λ)} _(m) _(t) ,  (61)

Where

$\begin{matrix} {{{f_{\overset{\sim}{\lambda}}\left( {{\overset{\sim}{\lambda}}_{1},\ldots \mspace{14mu},{\overset{\sim}{\lambda}}_{m_{t}}} \right)} = {\frac{K_{m_{t},m_{r},m}^{- 1}}{2_{m_{t}!}^{m_{t}{({m - m_{t}})}}}{\prod\limits_{i = 1}^{m_{t}}{\left( {1 - {\overset{\sim}{\lambda}}_{i}} \right)^{\alpha}\left( {1 + {\overset{\sim}{\lambda}}_{i}} \right)^{\beta}{\prod\limits_{i < j}\left( {{\overset{\sim}{\lambda}}_{i} - {\overset{\sim}{\lambda}}_{j}} \right)^{2}}}}}},} & (62) \end{matrix}$

and we denote α=m_(r)−m_(t) and β=m_(r)−m_(t).

Now, the term

$\prod\limits_{1 \leq i < j \leq m_{t}}\left( {{\overset{\sim}{\lambda}}_{i} - {\overset{\sim}{\lambda}}_{j}} \right)$

is the determinant of the Vandermonde matrix

$\begin{matrix} {\begin{bmatrix} 1 & \ldots & 1 \\ {\overset{\sim}{\lambda}}_{1} & \ldots & {\overset{\sim}{\lambda}}_{m_{t}} \\ \vdots & \; & \vdots \\ {\overset{\sim}{\lambda}}_{1}^{m_{t} - 1} & \ldots & {\overset{\sim}{\lambda}}_{m_{t}}^{m_{t} - 1} \end{bmatrix}.} & (63) \end{matrix}$

With row operations we can transform (63) into the following matrix

$\begin{matrix} {\begin{bmatrix} {P_{1}^{({\alpha,\beta})}\left( {\overset{\sim}{\lambda}}_{1} \right)} & \ldots & {P_{1}^{({\alpha,\beta})}\left( {\overset{\sim}{\lambda}}_{m_{t}} \right)} \\ \vdots & \; & \vdots \\ {P_{m_{t}}^{({\alpha,\beta})}\left( {\overset{\sim}{\lambda}}_{1} \right)} & \ldots & {P_{m_{t}}^{({\alpha,\beta})}\left( {\overset{\sim}{\lambda}}_{m_{t}} \right)} \end{bmatrix}.} & (64) \end{matrix}$

Where P_(n) ^((α,β))(x) are the Jacobi polynomials which form a complete orthogonal system in the interval [−1,1] with respect to the weighting function w(x)=(1−x)^(α)(1+x)^(β):

∫⁻¹ ¹ w(x)P _(n) ^((α,β))(x)P _(k) ^((α,β))(x)dx=a _(k,α,β)·δ_(kn),  (65)

Where for integers α and β

$\begin{matrix} {a_{k,\alpha,\beta} = \frac{2^{\alpha + \beta + 1}}{2k + \alpha + \beta + 1}\begin{pmatrix} {2k + \alpha + \beta} \\ k \end{pmatrix}\begin{pmatrix} {2k + \alpha + \beta} \\ {k + \alpha} \end{pmatrix}^{- 1}.} & (66) \end{matrix}$
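The orthogonality relation (65) can be checked exactly for small integer α and β. The sketch below reads (66) as a_(k,α,β)=2^(α+β+1)/(2k+α+β+1) times the ratio of the binomial coefficients C(2k+α+β,k) and C(2k+α+β,k+α), and builds the Jacobi polynomials from their standard explicit sum; the helper names are hypothetical and all arithmetic is done with exact fractions.

```python
from fractions import Fraction
from math import comb

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_pow(p, n):
    out = [Fraction(1)]
    for _ in range(n):
        out = poly_mul(out, p)
    return out

def jacobi_poly(n, alpha, beta):
    """P_n^(alpha,beta)(x) = sum_s C(n+alpha, n-s) C(n+beta, s) ((x-1)/2)^s ((x+1)/2)^(n-s)."""
    half_minus = [Fraction(-1, 2), Fraction(1, 2)]   # (x - 1) / 2
    half_plus = [Fraction(1, 2), Fraction(1, 2)]     # (x + 1) / 2
    total = [Fraction(0)] * (n + 1)
    for s in range(n + 1):
        c = comb(n + alpha, n - s) * comb(n + beta, s)
        term = poly_mul(poly_pow(half_minus, s), poly_pow(half_plus, n - s))
        for d, t in enumerate(term):
            total[d] += c * t
    return total

def integrate_sym(p):
    """Exact integral of a polynomial over [-1, 1] (odd powers vanish)."""
    return sum(Fraction(2, d + 1) * c for d, c in enumerate(p) if d % 2 == 0)

def inner(n, k, alpha, beta):
    """Left hand side of (65) for integer alpha, beta >= 0."""
    w = poly_mul(poly_pow([Fraction(1), Fraction(-1)], alpha),
                 poly_pow([Fraction(1), Fraction(1)], beta))
    pq = poly_mul(jacobi_poly(n, alpha, beta), jacobi_poly(k, alpha, beta))
    return integrate_sym(poly_mul(w, pq))

def norm_66(k, alpha, beta):
    """a_(k,alpha,beta) as read from (66)."""
    s = 2 * k + alpha + beta
    return Fraction(2 ** (alpha + beta + 1), s + 1) * Fraction(comb(s, k), comb(s, k + alpha))
```

For α=1 and β=2, `inner(n, k, 1, 2)` vanishes for n≠k and equals `norm_66(k, 1, 2)` for n=k, for example 4/3 at k=0.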

By the definition of the determinant we have

$\begin{matrix} {{{\prod\limits_{1 \leq i < j \leq m_{t}}\left( {{\overset{\sim}{\lambda}}_{i} - {\overset{\sim}{\lambda}}_{j}} \right)} = {C_{m_{t},m_{r},m}{\sum\limits_{\sigma \in S_{m_{t}}}{\left( {- 1} \right)^{{sgn}{(\sigma)}}{\prod\limits_{i = 1}^{m_{t}}{P_{\sigma {(i)}}^{({\alpha,\beta})}\left( {\overset{\sim}{\lambda}}_{i} \right)}}}}}},} & (67) \end{matrix}$

where S_(m_(t)) is the set of permutations of 1, . . . , m_(t), sgn(σ) denotes the signature of the permutation σ and C_(m_(t),m_(r),m) is a constant picked up from the transformation of the Vandermonde matrix (63) into (64). Substituting (67) into (62) we get:

$\begin{matrix} \begin{matrix} {{f_{\overset{\sim}{\lambda}}\left( {{\overset{\sim}{\lambda}}_{1},\ldots \mspace{14mu},{\overset{\sim}{\lambda}}_{m_{t}}} \right)} = {{\overset{\sim}{K}}_{m_{t},m_{r},m}^{- 1}{\sum\limits_{\sigma_{1},{\sigma_{2} \in S_{m_{t}}}}\left( {- 1} \right)^{{{sgn}{(\sigma_{1})}} + {{sgn}{(\sigma_{2})}}}}}} \\ {{\prod\limits_{i = 1}^{m_{t}}{\left( {1 - {\overset{\sim}{\lambda}}_{i}} \right)^{\alpha}\left( {1 + {\overset{\sim}{\lambda}}_{i}} \right)^{\beta}{P_{\sigma_{1}{(i)}}^{({\alpha,\beta})}\left( {\overset{\sim}{\lambda}}_{i} \right)}{{P_{\sigma_{2}{(i)}}^{({\alpha,\beta})}\left( {\overset{\sim}{\lambda}}_{i} \right)}.}}}} \end{matrix} & (68) \end{matrix}$

Further integrating over {tilde over (λ)}₂, . . . , {tilde over (λ)}_(m_(t)) results in

$\begin{matrix} {{f_{{\overset{\sim}{\lambda}}_{1}}\left( {\overset{\sim}{\lambda}}_{1} \right)} = {{\overset{\sim}{K}}_{m_{t},m_{r},m}^{- 1}{\sum\limits_{\sigma_{1},{\sigma_{2} \in S_{m_{t}}}}{\left( {- 1} \right)^{{{sgn}{(\sigma_{1})}} + {{sgn}{(\sigma_{2})}}}\left( {1 - {\overset{\sim}{\lambda}}_{1}} \right)^{\alpha}\left( {1 + {\overset{\sim}{\lambda}}_{1}} \right)^{\beta}}}}} & {(69)} \\ {{{P_{\sigma_{1}{(1)}}^{({\alpha,\beta})}\left( {\overset{\sim}{\lambda}}_{1} \right)}{P_{\sigma_{2}{(1)}}^{({\alpha,\beta})}\left( {\overset{\sim}{\lambda}}_{1} \right)}{\prod\limits_{i = 2}^{m_{t}}{a_{{\sigma_{1}{(i)}},\alpha,\beta} \cdot \delta_{{\sigma_{1}{(i)}}{\sigma_{2}{(i)}}}}}}} & \\ {= {{{{\overset{\sim}{K}}_{m_{t},m_{r},m}^{- 1}\left( {m_{t} - 1} \right)}!}{\sum\limits_{k = 1}^{m_{t}}{\left( {1 - {\overset{\sim}{\lambda}}_{1}} \right)^{\alpha}\left( {1 + {\overset{\sim}{\lambda}}_{1}} \right)^{\beta}{P_{k}^{({\alpha,\beta})}\left( {\overset{\sim}{\lambda}}_{1} \right)}^{2}{\prod\limits_{i \neq k}a_{i,\alpha,\beta}}}}}} & {(70)} \\ {{= {\frac{1}{m_{t}}{\sum\limits_{k = 1}^{m_{t}}{a_{k,\alpha,\beta}^{- 1}{P_{k}^{({\alpha,\beta})}\left( {\overset{\sim}{\lambda}}_{1} \right)}^{2}\left( {1 - {\overset{\sim}{\lambda}}_{1}} \right)^{\alpha}\left( {1 + {\overset{\sim}{\lambda}}_{1}} \right)^{\beta}}}}},} & {(71)} \end{matrix}$

where the first equality follows from (65), which implies that only the terms with σ₁(i)=σ₂(i) for all i survive. This yields the second equality, while the third equality follows from (65) and the fact that f_({tilde over (λ)}₁)({tilde over (λ)}₁) must integrate to unity. Turning back to λ₁ we get:

$\begin{matrix} {f_{\lambda_{1}}\left( \lambda_{1} \right) = \frac{1}{m_{t}}\sum\limits_{k = 1}^{m_{t}}b_{k,\alpha,\beta}^{-1}\left( P_{k}^{({\alpha,\beta})}\left( 1 - 2\lambda_{1} \right) \right)^{2}\lambda_{1}^{\alpha}\left( 1 - \lambda_{1} \right)^{\beta},} & (72) \end{matrix}$

where

$\begin{matrix} {b_{k,\alpha,\beta} = \frac{1}{2k + \alpha + \beta + 1}\binom{2k + \alpha + \beta}{k}\binom{2k + \alpha + \beta}{k + \alpha}^{-1}.} & (73) \end{matrix}$

8 Proof of Theorem 5

For simplicity we assume m_(t)≦m_(r) (without loss of generality, since the outage probability is symmetric in m_(t) and m_(r)). Now, let us examine the outage probability which is given by (25):

$\begin{matrix} {{{P_{out}\left( {m_{t},m_{r},{m;{r\; {\log \left( {1 + {SNR}} \right)}}}} \right)} = {K_{m_{t},m_{r},m}^{- 1}{\int_{B}^{\;}{\prod\limits_{i = 1}^{m_{t}}\; {{{\lambda_{i}^{m_{r} - m_{t}}\left( {1 - \lambda_{i}} \right)}\ }^{m - m_{r} - m_{t}}{\prod\limits_{i < j}^{\;}\; {\left( {\lambda_{i} - \lambda_{j}} \right)^{2}{\lambda}}}}}}}},} & (74) \end{matrix}$

where K_(m_(t),m_(r),m) is a normalizing factor and

$B = \left\{ {{{\lambda \text{:}\mspace{14mu} 0} \leq \lambda_{1} \leq \ldots \leq \lambda_{m_{t}} \leq 1},{{\prod\limits_{i = 1}^{m_{t}}\; \left( {1 + {{SNR} \cdot \lambda_{i}}} \right)} < \left( {1 + {SNR}} \right)^{r}}} \right\}$

is the set that describes the outage event.

Letting

λ_(i)=SNR^(−α_(i))  (75)

for i=1, . . . , m_(t) allows us to write

$\begin{matrix} {{{P_{out}\left( {m_{t},m_{r},{m;{r\; {\log \left( {1 + {SNR}} \right)}}}} \right)} = {{\log ({SNR})}^{m_{t}}K_{m_{t},m_{r},m}^{- 1}{\int_{B}^{\;}{\prod\limits_{i = 1}^{m_{t}}\; {{SNR}^{- {\alpha_{i}{({m_{r} - m_{t} + 1})}}}.}}}}}\ } & (76) \\ {\mspace{76mu} {{\cdot \left( {1 - {SNR}^{- \alpha_{i}}} \right)^{m - m_{r} - m_{t}}}{\prod\limits_{i < j}^{\;}\; {\left( {{SNR}^{- \alpha_{i}} - {SNR}^{- \alpha_{j}}} \right)^{2}{{\alpha}.}}}}} & (77) \end{matrix}$

Since 1+SNR^(1−α_(i))≐SNR^((1−α_(i))⁺), where (x)⁺=max{0,x}, we can describe the set of outage events by

$B = {\left\{ {{{\alpha \text{:}\mspace{14mu} \alpha_{1}} \geq \ldots \geq \alpha_{m_{t}} \geq 0},{{\sum\limits_{i = 1}^{m_{t}}\; \left( {1 - \alpha_{i}} \right)^{+}} < r}} \right\}.}$

Since the term log(SNR)^(m_(t))K_(m_(t),m_(r),m)⁻¹ has no effect on the SNR exponent, i.e., satisfies

$\begin{matrix} {{{\lim\limits_{{SNR}\rightarrow\infty}\mspace{14mu} \frac{\log \left( {{\log ({SNR})}^{m_{t}}K_{m_{t},m_{r},m}^{- 1}} \right)}{\log \mspace{11mu} {SNR}}} = 0},} & (78) \end{matrix}$

we get

$\begin{matrix} {P_{out}\left( m_{t},m_{r},m;r\log\left( 1 + SNR \right) \right) \doteq \int_{B}\prod\limits_{i = 1}^{m_{t}}SNR^{- \alpha_{i}(m_{r} - m_{t} + 1)}} & (79) \\ {\cdot \left( 1 - SNR^{- \alpha_{i}} \right)^{m - m_{r} - m_{t}}\prod\limits_{i < j}\left( SNR^{- \alpha_{i}} - SNR^{- \alpha_{j}} \right)^{2}\, d\alpha.} & (80) \end{matrix}$

Now, we note that

$\begin{matrix} {{P_{out}\left( {m_{t},m_{r},{m;{r\; {\log \left( {1 + {SNR}} \right)}}}} \right)} \leq {\int_{B}^{\;}{\prod\limits_{i = 1}^{m_{t}}\; {{SNR}^{- {\alpha_{i}{({m_{r} - m_{t} + 1})}}}{\prod\limits_{i < j}^{\;}\; {\left( {{SNR}^{- \alpha_{i}} - {SNR}^{- \alpha_{j}}} \right)^{2}\ {{\alpha}.}}}}}}} & (81) \end{matrix}$

It was proven that the right-hand side of (81) satisfies:

$\begin{matrix} {\int_{B}\prod\limits_{i = 1}^{m_{t}}SNR^{- \alpha_{i}(m_{r} - m_{t} + 1)}\prod\limits_{i < j}\left( SNR^{- \alpha_{i}} - SNR^{- \alpha_{j}} \right)^{2}\, d\alpha \doteq SNR^{- f(\alpha^{*})},} & (82) \end{matrix}$

where

$\begin{matrix} {{f(\alpha)} = {\sum\limits_{i = 1}^{m_{t}}\; {\left( {{2\; i} - 1 + m_{r} - m_{t}} \right)\alpha_{i}}}} & (83) \end{matrix}$

and

$\begin{matrix} {\alpha^{*} = {\arg \; \inf\limits_{\alpha \in B}\mspace{14mu} {{f(\alpha)}.}}} & (84) \end{matrix}$

By defining S_(δ)={α:α_(i)>δ∀i=1, . . . , m_(t)} for any δ>0, we can write

$\begin{matrix} {{{P_{out}\left( {m_{t},m_{r},{m;{r\; {\log \left( {1 + {SNR}} \right)}}}} \right)} \geq {\int_{B\bigcap S_{\delta}}^{\;}{\prod\limits_{i = 1}^{m_{t}}\; {{SNR}^{- {\alpha_{i}{({m_{r} - m_{t} + 1})}}}.}}}}\ } & (85) \\ {{\cdot \left( {1 - {SNR}^{- \alpha_{i}}} \right)^{m - m_{r} - m_{t}}}{\prod\limits_{i < j}^{\;}\; {\left( {{SNR}^{- \alpha_{i}} - {SNR}^{- \alpha_{j}}} \right)^{2}{\alpha}}}} & (86) \\ {{\geq {\left( {1 - {SNR}^{- \delta}} \right)^{m_{t}{({m - m_{r} - m_{t}})}}{\int_{B\bigcap S_{\delta}}^{\;}{\prod\limits_{i = 1}^{m_{t}}\; {{SNR}^{- {\alpha_{i}{({m_{r} - m_{t} + 1})}}}.}}}}}\ } & (87) \\ {\cdot {\prod\limits_{i < j}^{\;}\; {\left( {{SNR}^{- \alpha_{i}} - {SNR}^{- \alpha_{j}}} \right)^{2}{\alpha}}}} & (88) \\ {{\doteq {SNR}^{- {f{(\alpha_{\delta}^{*})}}}},} & (89) \end{matrix}$

where

$\begin{matrix} {\alpha_{\delta}^{*} = {\arg \; \inf\limits_{\alpha \in {B\bigcap S_{\delta}}}\mspace{14mu} {{f(\alpha)}.}}} & (90) \end{matrix}$

Using the continuity of f, α_(δ)* approaches α* as δ goes to zero and we can conclude that

P_(out)(m_(t),m_(r),m;r log(1+SNR))≐SNR^(−f(α*)).  (91)

This result was obtained for the outage probability in the Rayleigh channel. From here one can continue as was presented, showing that the error probability is dominated by the outage probability at high SNR for l≧m_(t)+m_(r)−1 (these proofs rely on (91) without making any assumptions on the channel statistics, and are therefore true also for the Jacobi channel).
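The exponent f(α*) defined by (83) and (84) can be evaluated with a short computation. The sketch below is an illustration under stated assumptions rather than part of the proof: it takes the infimum over the closure of B, and it uses a greedy argument that is not taken from the text. Values α_(i)>1 only increase f without relaxing the constraint, so each α_(i) can be restricted to [0,1], where the constraint reads Σα_(i)≥m_(t)−r; since the coefficients in (83) increase with i, the required mass is assigned to the smallest indices first, which also respects the ordering α₁≥ . . . ≥α_(m_(t)). The function name is hypothetical.

```python
def outage_exponent(m_t, m_r, r):
    """Infimum of f(alpha) in (83) over the closure of B, via a greedy fill."""
    if m_t > m_r:
        m_t, m_r = m_r, m_t  # the proof assumes m_t <= m_r
    # Within [0, 1]^m_t the constraint sum (1 - alpha_i)^+ <= r
    # is equivalent to sum alpha_i >= m_t - r.
    remaining = max(0.0, m_t - r)
    exponent = 0.0
    for i in range(1, m_t + 1):
        alpha_i = min(1.0, remaining)  # smallest coefficients are used first
        exponent += (2 * i - 1 + m_r - m_t) * alpha_i
        remaining -= alpha_i
    return exponent

# Illustrative values of m_t, m_r and r (hypothetical).
for r in (0, 0.5, 1, 2):
    print(r, outage_exponent(2, 3, r))
```

For an integer r=k with 0≤k≤m_(t), the greedy solution sets α_(i)=1 for i≤m_(t)−k and zero otherwise, so the sum in (83) evaluates to (m_(t)−k)(m_(r)−k); between integer points the value is linear in r.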

We claim:
 1. A method for transmitting information over a communication channel coupled between a transmitter and a receiver, the method comprises: transmitting, by the transmitter using a first number (Mt) of transmission paths, multiple transmitter data streams and a transmitter feedback stream; wherein the transmitter feedback stream comprises transmitter feedback symbols calculated in response to receiver feedback stream symbols; receiving, by the receiver during multiple points in time and using a second number (Mr) of reception paths, a plurality of received streams that represent the multiple transmitter data streams and the transmitter feedback stream; estimating by the receiver, transfer matrixes of the communication channel that correspond to the multiple points in time; transmitting by the receiver and over a feedback channel, a receiver feedback stream indicative of the transfer matrixes; wherein the receiver feedback stream comprises the receiver feedback stream symbols; performing at least one additional transmission of a transmitter data symbol of the transmitter data streams to guarantee that the receiver is capable of reconstructing the transmitter data symbol with a desired certainty to provide a reconstructed transmitter data symbol; and reconstructing, by the receiver, the transmitter data streams in response to the reconstructed transmitter data symbol, the plurality of receiver data streams and the received transmitter feedback stream; wherein each one of the first number (Mt) and the second number (Mr) is smaller than a maximal number of paths (M) supported by the communication channel.
 2. The method according to claim 1, wherein the multiple transmitter data streams and the transmitter feedback stream are transmitted concurrently.
 3. The method according to claim 1, wherein a number of the transmitter data streams does not exceed Mt+Mr−Mc.
 4. The method according to claim 1, wherein the channel is a multimode optic fiber and the transmission paths are implemented by multi-modes of the transmitter.
 5. The method according to claim 1, wherein the channel is an optical fiber that has multiple cores and the transmission paths are implemented by the multiple cores.
 6. The method according to claim 1, wherein the receiver feedback stream comprises the transfer matrixes of the communication channel during the receiving.
 7. The method according to claim 1 wherein the transfer matrixes belong to unitary matrixes; wherein the receiver feedback stream comprises parts of the unitary matrixes.
 8. The method according to claim 7, wherein the receiver feedback stream comprises parts of the unitary matrixes that differ from the transfer matrixes of the communication channel during the receiving but facilitate a reconstruction of the transfer matrixes.
 9. The method according to claim 8, wherein the parts of the unitary matrixes are smaller than the transfer matrixes.
 10. The method according to claim 1, comprising transmitting, at a certain time slot, multiple transmitter data symbols of the multiple transmitter data streams and a transmitter feedback symbol of the transmitter feedback stream, wherein the transmitter feedback symbol is responsive to a transmitter data symbol of the multiple transmitter data streams and to a receiver feedback symbol received during a time slot that precedes the certain time slot.
 11. The method according to claim 1, comprising transmitting, at a certain time slot, multiple transmitter data symbols of the multiple transmitter data streams and a transmitter feedback symbol of the transmitter feedback stream, wherein the transmitter feedback symbol is responsive to a product of (a) transmitter data symbol of the multiple transmitter data streams and (b) a receiver feedback symbol received during a time slot that precedes the certain time slot.
 12. A system for transmitting information, the system comprises a receiver, a transmitter; wherein the receiver and the transmitter are coupled to a communication channel; wherein the transmitter is arranged to transmit, using a first number (Mt) of transmission paths, multiple transmitter data streams and a transmitter feedback stream; wherein the transmitter feedback stream comprises transmitter feedback symbols calculated in response to receiver feedback stream symbols; wherein the receiver is arranged to: receive, during multiple points in time and using a second number (Mr) of reception paths, a plurality of received streams that represent the multiple transmitter data streams and the transmitter feedback stream; estimate transfer matrixes of the communication channel that correspond to the multiple points in time; transmit over a feedback channel, a receiver feedback stream indicative of the transfer matrixes wherein the receiver feedback stream comprises the receiver feedback stream symbols; wherein the transmitter is further arranged to perform at least one additional transmission of a transmitter data symbol of the transmitter data streams to guarantee that the receiver is capable of reconstructing the transmitter data symbol with a desired certainty to provide a reconstructed transmitter data symbol; and wherein the receiver is further arranged to reconstruct the transmitter data streams in response to the reconstructed transmitter data symbol, the plurality of receiver data streams and the received transmitter feedback stream; wherein each one of the first number (Mt) and the second number (Mr) is smaller than a maximal number of paths (M) supported by the communication channel.
 13. The system according to claim 12, wherein the transmitter is arranged to transmit the multiple transmitter data streams and the transmitter feedback stream concurrently.
 14. The system according to claim 12, wherein a number of the transmitter data streams does not exceed Mt+Mr−Mc.
 15. The system according to claim 12, wherein the communication channel is a multimode optic fiber and the transmission paths are implemented by multi-modes of the transmitter.
 16. The system according to claim 12, wherein the communication channel is an optical fiber that has multiple cores and the transmission paths are implemented by the multiple cores.
 17. The system according to claim 12, wherein the receiver feedback stream comprises the transfer matrixes of the communication channel during the receiving.
 18. The system according to claim 12, wherein the receiver feedback stream comprises parts of the unitary matrices that differ from the transfer matrixes of the communication channel during the receiving but facilitate a reconstruction of the transfer matrixes.
 19. The system according to claim 18, wherein the parts of the unitary matrixes are smaller than the transfer matrixes.
 20. The system according to claim 12, wherein the transmitter is arranged to transmit, at a certain time slot, multiple transmitter data symbols of the multiple transmitter data streams and a transmitter feedback symbol of the transmitter feedback stream, wherein the transmitter feedback symbol is responsive to a transmitter data symbol of the multiple transmitter data streams and to a receiver feedback symbol received during a time slot that precedes the certain time slot.
 21. The system according to claim 12, wherein the transmitter is arranged to transmit, at a certain time slot, multiple transmitter data symbols of the multiple transmitter data streams and a transmitter feedback symbol of the transmitter feedback stream, wherein the transmitter feedback symbol is responsive to a product of (a) transmitter data symbol of the multiple transmitter data streams and (b) a receiver feedback symbol received during a time slot that precedes the certain time slot.
 22. A receiver, comprising: a receiving module arranged to: receive, during multiple points in time and using a second number (Mr) of reception paths, a plurality of receiver data streams and a received transmitter feedback stream that represent multiple transmitter data streams and a transmitter feedback stream; receive at least one additional transmission of a transmitter data symbol of the transmitter data streams; wherein the multiple transmitter data streams are transmitted by a transmitter coupled to the receiver via a communication channel, using a first number (Mt) of transmission paths; wherein the transmitter feedback stream comprises transmitter feedback symbols calculated in response to receiver feedback stream symbols; a transfer matrix estimator that is arranged to estimate transfer matrixes of the communication channel that correspond to the multiple points in time; wherein the transfer matrixes are portions of unitary matrixes, each unitary matrix is associated with a point in time of the multiple points in time; a receiver feedback module arranged to transmit over a feedback channel, a receiver feedback stream indicative of parts of the unitary matrixes; wherein the receiver feedback stream comprises the receiver feedback stream symbols; a reconstruction module that is arranged to: reconstruct, in response to the reception of the at least one additional transmission of the transmitter data symbol, the transmitter data symbol to provide a reconstructed transmitter data symbol; and reconstruct the transmitter data streams in response to the reconstructed transmitter data symbol, the plurality of receiver data streams and the received transmitter feedback stream; wherein each one of the first number (Mt) and the second number (Mr) is smaller than a maximal number of paths (M) supported by the communication channel.
 23. A transmitter, comprising: a transmission module that is arranged to transmit, using a first number (Mt) of transmission paths, multiple transmitter data streams and a transmitter feedback stream; a transmitter feedback symbol calculator arranged to calculate transmitter feedback stream symbols of the transmitter feedback stream in response to receiver feedback stream symbols; wherein the receiver feedback stream is sent to the transmitter over a feedback channel by a receiver using a second number (Mr) of reception paths; wherein the receiver and the transmitter are coupled to a communication channel; wherein the receiver feedback stream is indicative of unitary matrices, each unitary matrix comprises a transfer matrix of the communication channel at a certain point in time out of multiple points in time during which the receiver received a plurality of receiver data streams and a received transmitter feedback stream that represent the multiple transmitter data streams and the transmitter feedback stream; a transmission module that is further arranged to perform at least one additional transmission of a transmitter data symbol of the transmitter data streams to guarantee that the receiver is capable of reconstructing the transmitter data symbol with a desired certainty to provide a reconstructed transmitter data symbol; and wherein each one of the first number (Mt) and the second number (Mr) is smaller than a maximal number of paths (M) supported by the communication channel.